History of Previous Talks
Unsupervised Machine Learning for Matrix Decomposition
Date: January 9, 2017
Abstract: Unsupervised learning is a classical approach in pattern recognition and data analysis. Its importance is growing today, due to the increasing data volumes and the difficulty of obtaining statistically sufficient amounts of labelled training data. Typical analysis techniques using unsupervised learning are principal component analysis, independent component analysis, and cluster analysis. They can all be presented as decompositions of the data matrix containing the unlabeled samples. Starting from the classical results, the author reviews some advances in the field up to the present day.
Speaker: Erkki Oja
Affiliation: Professor Emeritus, Aalto University
Place of Seminar: Aalto University
Probabilistic Programming: Bayesian Modeling Made Easy
Date: January 16, 2017
Abstract: Probabilistic models are principled tools for understanding data, but the difficulty of inference limits the complexity of models we can actually use. Often we need to develop specific inference algorithms for new models (which might take months), and need to restrict ourselves to tractable model families that might not match our beliefs about the data. Probabilistic programming promises to fix this by separating the model description from the inference: with probabilistic programming languages we can specify complex models using a high-level programming language, letting a black-box inference engine take care of the tricky details. This talk covers the basic idea of probabilistic programming and discusses how well its promises hold now and in the future.
Speaker: Arto Klami
Affiliation: Academy Research Fellow, University of Helsinki
Place of Seminar: University of Helsinki
Metabolite Identification Through Machine Learning
Date: January 23, 2017
Abstract: Identification of small molecules from biological samples remains a major bottleneck in understanding the inner workings of biological cells and their environment. Machine learning on data from large public databases of tandem mass spectrometric data has transformed this field in recent years, with identification rates increasing by 150%. In this presentation, I will outline the key machine learning methods behind this development: kernel-based learning of molecular fingerprints, multiple kernel learning, and structured prediction, as well as some recent advances.
Speaker: Juho Rousu
Affiliation: Associate Professor, Aalto University
Place of Seminar: Aalto University
Likelihood-free Inference and Predictions for Computational Epidemiology
Date: January 30, 2017
Abstract: Simulator-based models often allow inference and predictions under more realistic assumptions than those employed in standard statistical models. For example, the observation model for an underlying stochastic process can be more freely chosen to reflect the characteristics of the data gathering procedure. A major obstacle for such models is the intractability of the likelihood, which has to a large extent hampered their practical applicability. I will discuss recent advances in likelihood-free inference that greatly accelerate the model fitting process by exploiting a combination of machine learning techniques. Applications to several novel models in infectious disease epidemiology are used to illustrate the potential offered by this approach.
Speaker: Jukka Corander
Affiliation: Professor, University of Helsinki and University of Oslo
Place of Seminar: University of Helsinki
Towards Perfect Density Estimation
Date: February 6, 2017
Abstract: We start by addressing a very simple problem, the estimation of a one-dimensional density function, and argue that despite the apparent simplicity of the problem, it is surprisingly difficult to solve it in a holistic manner that is both computationally feasible and theoretically justifiable without strong distributional or other assumptions. We demonstrate how the information-theoretic MDL framework can be used for reaching this goal (almost) perfectly, and show how this simple setup gives interesting perspectives on the fundamental concepts in probabilistic modelling and statistical inference. We also discuss ideas for extending the framework to more complex models with additional practical applications.
Speaker: Petri Myllymäki
Affiliation: Professor, University of Helsinki
Place of Seminar: Aalto University
Variable Selection From Summary Statistics
Date: February 13, 2017
Abstract: With increasing capabilities to measure a massive number of variables, efficient variable selection methods are needed to improve our understanding of the underlying data generating processes. This is evident, for example, in human genomics, where genomic regions showing association to a disease may contain thousands of highly correlated variants, while we expect that only a small number of them are truly involved in the disease process. I outline recent ideas that have made variable selection practical in human genomics and demonstrate them through our experiences with the FINEMAP algorithm (Benner et al. 2016, Bioinformatics).
(1) Compressing data to light-weight summaries to avoid logistics and privacy concerns related to complete data sharing and to minimize the computational overhead.
(2) Efficient implementation of sparsity assumptions.
(3) Efficient stochastic search algorithms.
(4) Use of public reference databases to complement the available summary statistics.
Speaker: Matti Pirinen
Affiliation: Academy Research Fellow, Institute for Molecular Medicine Finland, University of Helsinki
Place of Seminar: University of Helsinki
Compressed Sensing for Semi-Supervised Learning From Big Data Over Networks
Date: February 20, 2017
Abstract: In this talk I will present some of our most recent work on the application of compressed sensing to semi-supervised learning from massive network-structured datasets, i.e., big data over networks. We expect the use of compressed sensing ideas to be game-changing for machine learning from big data in a similar manner as it was for digital signal processing. In particular, I will present a sparse label propagation algorithm which efficiently learns from large amounts of network-structured unlabeled data by leveraging the information provided by a few initially labelled training data points. This algorithm is inspired by compressed sensing recovery methods and allows for a simple sufficient condition on the network structure which guarantees accurate learning.
Speaker: Alexander Jung
Affiliation: Assistant Professor, Aalto University
Place of Seminar: Aalto University
Inverse Modeling in Behavioral Sciences and HCI
Date: February 27, 2017
Abstract: Can one make deep inferences about a person based only on observations of how she acts? I discuss methodology for inverse modeling in behavioral sciences, where the goal is to estimate a cognitive model from limited behavioral data. Given substantial diversity in people’s intentions, strategies and abilities, this is a difficult and previously unaddressed problem. I discuss advances achieved with an approach that combines (1) computational rationality, to predict how a person adapts to a task when her capabilities are known, and (2) Approximate Bayesian Computation (ABC), to estimate those capabilities. The benefit is that model parameters are conditioned on both prior knowledge and observations, which improves model validity and helps identify causes for observations. Inverse modeling methods can advance theory formation by bringing complex behavior within reach of modeling. This talk is based on ongoing collaborations with Antti Kangasraasio, Samuel Kaski, Jukka Corander, Andrew Howes, Kumaripaba Athukorala, Jussi Jokinen, Sayan Sarcar, and Xiangshi Ren.
Speaker: Antti Oulasvirta
Affiliation: Associate Professor, Aalto University
Place of Seminar: University of Helsinki
Differentially Private Bayesian Learning
Date: March 6, 2017
Abstract: Many applications of machine learning, for example in health care, would benefit from methods that can guarantee data subject privacy. Differential privacy has recently emerged as a leading framework for private data analysis. Differential privacy guarantees privacy by requiring that the results of an algorithm should not change much even if one data point is changed, thus providing plausible deniability for the data subjects.
In this talk I will present methods for efficient differentially private Bayesian learning. In addition to asymptotic efficiency, we will focus on how to make the methods efficient for moderately sized data sets. The methods are based on perturbation of sufficient statistics for exponential family models and perturbation of gradients for variational inference. Unlike the previous state of the art, our methods can predict drug sensitivity of cancer cell lines using differentially private linear regression with better accuracy than using a very small non-private data set.
Speaker: Antti Honkela
Affiliation: Assistant Professor, University of Helsinki
Place of Seminar: Aalto University
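Illustrative note on the abstract above: perturbing a sufficient statistic can be sketched with the standard Laplace mechanism applied to a clipped sum. This is a minimal sketch under stated assumptions, not the speaker's method; the function names `laplace_noise` and `private_sum`, the clipping bounds, and the choice of the sum as the statistic are all hypothetical choices made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_sum(values, lower, upper, epsilon, rng):
    """Release the sum of `values` with epsilon-differential privacy.

    Assumption (illustrative): each value is clipped to [lower, upper], so
    changing one data point moves the sum by at most (upper - lower).
    """
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = upper - lower
    return sum(clipped) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [0.2, 0.9, 0.4, 0.7]
    # Smaller epsilon means stronger privacy and a noisier released statistic.
    print(private_sum(data, 0.0, 1.0, epsilon=1.0, rng=rng))
```

The noise scale depends only on the clipping range and epsilon, not on the number of data points, so the relative error of the released sum shrinks as the data set grows.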
Small Data AUC Estimation of Machine Learning Methods: Pitfalls and Remedies
Date: March 13, 2017
Abstract: Asking whether two populations can be distinguished from each other is one of the most fundamental questions in data analysis, and the area under the ROC curve (AUC) is one of the simplest and most practical tools for answering it. Also known as the Wilcoxon–Mann–Whitney U statistic, it can be associated with a p-value indicating how likely one would be to obtain an equally good AUC value if the two populations were not stochastically different. Estimating the AUC of a predictive model and its statistical significance has huge practical importance in fields like medicine, where one often has access to only small amounts of labeled data but a large number of features. Leave-pair-out cross-validation (LPOCV) is an almost unbiased AUC estimator for machine learning methods that has also been empirically shown to be the most reliable of the cross-validation (CV) based estimators. We further study the properties of LPOCV and show some serious pitfalls one can encounter when estimating AUC with CV and how to avoid them. In particular, we show how one can produce very promising results with high AUC values even if there is no signal in the data. Finally, we show how to counter these risks with new Wilcoxon–Mann–Whitney U type permutation tests adjusted for LPOCV, thus upgrading one of the classical statistical tools for CV estimates.
Speaker: Tapio Pahikkala
Affiliation: Assistant Professor, University of Turku
Place of Seminar: University of Helsinki
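Illustrative note on the abstract above: the link between AUC and the Wilcoxon–Mann–Whitney U statistic can be made concrete by counting correctly ordered pairs. The sketch below is a plain pairwise computation, not the LPOCV estimator or the permutation tests from the talk; the function name `auc_from_scores` and the example scores are hypothetical.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the normalized Wilcoxon-Mann-Whitney U statistic.

    Counts the (positive, negative) pairs in which the positive example
    receives the higher score; a tie counts as half a pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score from each population")
    u = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # 8 of the 9 pairs are ordered correctly.
    print(auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

An AUC of 0.5 corresponds to scores that do not separate the two populations at all, which is the null situation the p-value in the abstract refers to.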
Future AI: Autonomous machine learning and beyond
Date: March 20, 2017
Abstract: Many researchers have identified autonomous machine learning (unsupervised, semi-supervised and reinforcement learning) as an important cornerstone of advanced artificial intelligence. The Curious AI Company is developing such autonomous learning systems. We already have state-of-the-art results in several semi-supervised classification tasks but we are also working on bringing autonomy to learning segmentation and hierarchical control, both of them tasks that take a lot of human work when developing for instance self-driving cars. However, we believe there’s an even more important blocker on the way to advanced AI: the fundamental inability of currently used parallel distributed neural coding to properly represent objects and their interactions. We are working on deep learning networks whose neuro-symbolic representations will hopefully allow neural networks to understand the world not only in terms of a collection of features but in terms of objects and their interactions, too. This is necessary for many tasks such as communication, reasoning and complex decision making.
Speaker: Harri Valpola
Affiliation: CEO of the Curious AI Company
Place of Seminar: Aalto University
Learning to Rank: Applications to Bioinformatics
Date: March 27, 2017
Abstract: Learning To Rank (LTR) has been developed in information retrieval for ranking documents according to their relevance to a given query. Typically LTR builds a ranking model from given relevant (or irrelevant) query-document pairs. In some respects, LTR can be thought of as an attempt to solve a multilabel classification problem, where queries are labels. Many settings in bioinformatics can be turned into multilabel classification problems with relatively similar properties. One typical example is biomedical document annotation. Currently PubMed, a database of 26 million biomedical citations, has around 30,000 keywords, called MeSH (Medical Subject Headings) terms, i.e. labels in multilabel classification, where the number of articles per MeSH term is extremely diverse, ranging from only 20 to more than eight million. This large, biased dataset already goes beyond the settings expected by regular multilabel classifiers. In this talk, I will start with an introduction and a brief review of LTR. I then raise three bioinformatics multilabel classification problems that share real data-derived, practical properties, which hamper the application of regular multilabel classifiers. Finally I will show that LTR nicely addresses such large-scale, challenging bioinformatics multilabel classification problems.
A large portion of this talk appeared in ISMB in 2015 and 2016.
Speaker: Hiroshi Mamitsuka
Affiliation: Professor, Kyoto University
Place of Seminar: University of Helsinki
Multilayer Networks
Date: April 3, 2017
Abstract: Network science has been very successful in investigations of a wide variety of applications from biology and the social sciences to physics, technology, and more. In many situations, it is already insightful to use a simple (and typically naive) representation as a simple, binary graph in which nodes are entities and unweighted edges encapsulate the interactions between those entities. This allows one to use the powerful methods and concepts for example from graph theory, and numerous advances have been made in this way. However, as network science has matured and (especially) as ever more complicated data has become available, it has become increasingly important to develop tools to analyse more complicated structures. For example, many systems that were typically initially studied as simple graphs are now often represented as time-dependent networks, networks with multiple types of connections, or interdependent networks. This has allowed deeper and more realistic analyses of complex networked systems, but it has simultaneously introduced mathematical constructions, jargon, and methodology that are specific to research in each type of system. Recently, the concept of “multilayer networks” was developed in order to unify the aforementioned disparate language (and disparate notation) and to bring together the different generalised network concepts that included layered graphical structures. In this talk, I will introduce multilayer networks and discuss how to study their structure. Generalisations of the clustering coefficient for multiplex networks and graph isomorphism for general multilayer networks are used as illustrative examples.
Speaker: Mikko Kivelä
Affiliation: Postdoctoral Researcher, Aalto University
Place of Seminar: Aalto University
Learning With Spectral Kernels
Date: April 10, 2017
Abstract: Machine learning algorithms learn models that automatically infer data representations and generalise into new data. Gaussian processes are Bayesian kernel-based models with a key advantage of being able to efficiently learn kernel functions from data. All kernel functions can be decomposed into sinusoidal components, which provide a highly expressive basis for learning arbitrary representations. In this talk I will discuss how we can exploit spectral kernel learning for large-scale multi-task learning. We also generalise spectral learning into learning non-stationary kernels with input-specific behavior.
Speaker: Markus Heinonen
Affiliation: Department of Computer Science, Aalto University
Place of Seminar: University of Helsinki
Nintendo Wii Fit-Based Balance Testing to Detect Sleep Deprivation: Approximate Bayesian Computation Approach
Date: April 24, 2017
Abstract: Sleep deprivation deteriorates health and causes accidents. Measuring a person’s postural steadiness may be used to determine his or her state of alertness. Posturographic measurements are easy to conduct: a person’s body sway is measured during upright stance on a balance board for 60 s. The Nintendo Wii Fit balance board is a portable and affordable alternative to expensive clinical force plates. Body sway may be modeled with a single-link inverted pendulum (Asai et al. 2009). The model parameters, such as time delay and noise intensity in the nervous system, are physiologically relevant. The pendulum is kept upright with controllers that include stiffness and damping gain parameters. Level of control determines how often the active controller is ON. The model cannot be solved analytically in closed form. Therefore, inferring model parameters and their confidence limits is nontrivial. We used a sequential Monte Carlo approximate Bayesian computation (SMC-ABC) algorithm to infer the model parameters. The inferred parameters may allow determining a person’s state of alertness.
Speaker: Aino Tietäväinen
Affiliation: Department of Physics, University of Helsinki
Place of Seminar: Aalto University
Empirical Parameterization of Exploratory Search Systems Based on Bandit Algorithms
Date: May 8, 2017
Abstract: Exploratory search occurs when a user has insufficient knowledge to define exact search criteria or does not otherwise know what they are looking for. Reinforcement learning techniques have demonstrated great potential for supporting exploratory search in information retrieval systems, as they allow the system to trade off exploration (presenting the user with alternative topics) and exploitation (moving toward more specific topics). Users of such systems, however, often feel that the system is not responsive to user needs. This problem is not an inherent feature of such systems, but is caused by the exploration rate parameter being inappropriately tuned for a given system, dataset or user. In this talk, we discuss two approaches to optimising exploratory search systems based on bandit algorithms. First, we show that the trade-off between exploration and exploitation can be modelled as a direct relationship between the exploration rate parameter from the reinforcement learning algorithm and the number of relevant documents returned to the user over the course of a search session. We define the optimal exploration/exploitation trade-off as the point where this relationship is maximised and show this point to be broadly concordant with user satisfaction and performance. Our second approach aims to dynamically adapt exploration and exploitation in a manner commensurate with the user’s individual requirements for each search session. We present a novel study design together with a regression model for predicting the optimal exploration rate based on simple metrics from the first iteration, such as clicks and reading time. We perform model selection based on the data collected from a user study and show that predictions are consistent with user feedback.
Speaker: Dorota Glowacka
Affiliation: Department of Computer Science, University of Helsinki
Place of Seminar: University of Helsinki
Machine Learning for Image-Based Localization
Date: May 15, 2017
Abstract: Image-based localization refers to a problem where the camera position and orientation for a given query image are computed with respect to a known visual 3D map of the scene. This problem is relevant for applications such as robot self-localization, pedestrian navigation, and augmented reality. Another related problem is the relative pose estimation between two camera views, which is required for computing image-based 3D models from a collection of 2D images. Traditionally both of these problems have been approached by using hand-crafted local image features and descriptors, such as the widely used SIFT keypoint detector. However, several deep learning based localization approaches have recently been proposed. They omit local feature matching and directly try to regress the camera pose. In this presentation, we will give an overview of the problem area and explore some recent deep learning based approaches. We will also present some of our own recent results in this area.
Speaker: Juho Kannala
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
On Priors and Bayesian Variable Selection in Large p, Small n Regression
Date: May 22, 2017
Abstract: The Bayesian approach is well known for using priors to improve inference, but an equally important part is the integration over the uncertainties. I first present recent developments in hierarchical shrinkage priors for representing sparsity assumptions in covariate effects. I then present a projection predictive variable selection approach, which is a Bayesian decision theoretical approach for variable selection that can preserve the essential information and uncertainties related to all variables in the study. I also present recent excellent experimental results and easy-to-use software.
Speaker: Aki Vehtari
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: University of Helsinki
Graphics Meets Vision Meets Machine Learning
Date: May 29, 2017
Abstract: Realistic three-dimensional modeling and animation are key bottlenecks in the production of film, games, VR, and other applications of computer graphics. In this talk, I will describe our recent research that makes use of machine learning techniques for solving hard inference problems for generating 3D content: capture and reproduction of photorealistic surface appearance, facial performance capture, and turning audio into facial animation. These works both push the state of the art forward in research – two of the three projects have been published at ACM SIGGRAPH – and are already surprisingly ready for production use.
Speaker: Jaakko Lehtinen
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
Statistical Ecology with Gaussian Processes
Date: June 5, 2017
Abstract: Ecology studies the distribution and abundance of species, and their interactions with other species and the environment. Key questions in ecology include what are the environmental factors and interspecies dependencies that drive species distributions, how these processes together affect species community structures and how environmental changes, such as climate change, affect species distribution and species communities. These questions are essentially about variable selection and causal and predictive inference. Hence, statistics has a central role in answering them. The species distribution models (SDMs) used for these analyses are traditionally based on generalized linear and additive models. In this talk I will present how Gaussian processes (GPs) can be used in SDMs and what benefits and challenges this provides. I will present recent results on GP based species distribution modeling in the Baltic Sea and Great Barrier Reef, Australia. I will discuss the potential future development and current challenges related to computation and model building.
Speaker: Jarno Vanhatalo
Affiliation: Professor of Statistics, University of Helsinki
Place of Seminar: University of Helsinki
Learning Data Representation by Large-Scale Neighbor Embedding
Date: June 12, 2017
Abstract: Machine learning, the state-of-the-art data science, has been increasingly influencing our lives. Encoding data in a suitable vector space is the fundamental starting point for machine learning. A good vector coding should respect the relations among the data items. However, conventional methods that preserve pairwise or higher-order relationships are very slow and consequently can handle only small-scale data sets. We have been developing a family of unsupervised methods called large-scale Neighbor Embedding (NE) which substantially accelerate the vector coding. Our method can thus learn low-dimensional vector representations for mega-scale data according to their neighborhoods in the original space. With our efficient algorithms and a wealth of neighborhood information, Neighbor Embedding significantly outperforms small-scale NE and many other existing approaches for learning data representation. Besides generic feature extraction, our work also delivers two important tools as special cases of Neighbor Embedding for data visualization and cluster analysis, which scale up these applications by an order of magnitude and enable the current-sized visualization and clustering for interactive use. Because neighborhood information is naturally and massively available in many areas, our method has wide applications as a critical component in scientific research, next-generation DNA sequence analysis, natural language processing, educational cloud, financial data analysis, market studies, etc.
Speaker: Zhirong Yang
Affiliation: Department of Computer Science, Aalto University
Place of Seminar: Aalto University
Latent Stochastic Models for Comparing Tumor Samples of Unknown Purity
Date: September 11, 2017
Abstract: A challenge in analyzing data from tumor samples is that the biopsies contain a mixture of various cells, including cancer cells, immune cells and stromal cells. This hinders the discovery of clinically relevant information and can lead to systematically biased results. A few recent analysis techniques control for such factors, but only accommodate specific types of data, or require controls which cannot be obtained from each patient. I will present our developments on statistical methods for controlling for the latent and varying fraction of tumor cells in next-generation methylation and RNA sequencing data, which aim to enable unbiased and more accurate comparison of patient-derived samples.
Speaker: Antti Häkkinen
Affiliation: Postdoctoral fellow, Genome-Scale Biology Program, Faculty of Medicine, University of Helsinki
Place of Seminar: Aalto University
Fast Nearest Neighbor Search in High Dimensions by Multiple Random Projection Trees
Date: September 18, 2017
Abstract: Efficient index structures for fast approximate nearest neighbor queries are required in many applications such as recommendation systems. In high-dimensional spaces, many conventional methods suffer from excessive usage of memory and slow response times. We propose a method where multiple random projection trees are combined. We demonstrate by extensive experiments on a wide variety of data sets that the method is faster than existing partitioning tree or hashing based approaches, making it the fastest available technique on high accuracy levels.
Speaker: Teemu Roos
Affiliation: Associate Professor, Department of Computer Science, University of Helsinki
Place of Seminar: University of Helsinki
Hyvönen et al., “Fast Nearest Neighbor Search through Sparse Random Projections and Voting”, IEEE Big Data Conference 2016: [link]
Computational creativity and machine learning
Date: September 25, 2017
Abstract: Computational creativity has been defined as the art, science, philosophy and engineering of computational systems which, by taking on particular responsibilities, exhibit creative behaviours. In this talk I first try to elaborate on what creative responsibilities could be and why they are interesting. I then outline ways in which machine learning can be used to take on some of these responsibilities, helping computational systems become more creative.
Speaker: Hannu Toivonen
Affiliation: Professor of Computer Science, University of Helsinki
Place of Seminar: Aalto University
Machine learning for Materials Research
Date: October 2, 2017
Abstract: In materials research, we have learnt to predict the evolution of microstructure starting from atomic-level processes. We know about defects, both point and extended, and we know that these can be crucial for the final structural (and related mechanical and electrical) properties. The simple macroscopic differential equations used for this purpose often fail to predict even simple changes in materials. Many questions remain unanswered. Why does a ductile material suddenly become brittle? Why does a strong concrete bridge suddenly crack and eventually collapse after serving for tens of years? Why does the wall of high-quality steel in fission reactors suddenly crack? Or why does a clean, smooth surface roughen under applied electric fields? All these questions can be answered if one peeks into the behavior of atoms, imagining them jumping inside the material. But how do the atoms “choose” where to jump amongst the numerous possibilities in complex metals? Tedious parameterization can help to deal with the problem, but machine learning can provide a better and more elegant solution.
In my presentation, I will explain the problem at hand and show a few examples of former and current applications of neural networks (NNs) for calculating the barriers for atomic jumps, with an analysis of how well the applied NNs worked.
Speaker: Flyura Djurabekova
Affiliation: Department of Physics, University of Helsinki
Place of Seminar: University of Helsinki
Does my algorithm work?
Date: October 9, 2017
Abstract: It is easy to propose a new algorithm for solving a Machine Learning problem. It is much harder to convince other people that the proposed algorithm actually works. The “gold standard” of tight theoretical guarantees is often out of reach. So what do we do? Typically, an algorithm is validated on a couple of test problems and its output is compared with that of algorithms that are known to work. This is not a great strategy.
In this talk, I will outline a general strategy for assessing whether an algorithm for approximate Bayesian computing works on a given problem. This method does not require evaluation of the true posterior and also indicates ways in which the computed posterior systematically deviates from the true posterior.
Speaker: Daniel Simpson
Affiliation: Professor of Statistical Sciences, University of Toronto
Place of Seminar: Aalto University
Probabilistic preference learning with the Mallows rank model
Date: October 16, 2017
Abstract: Ranking and comparing items is crucial for collecting information about preferences in many areas, from marketing to politics. The Mallows rank model is among the most successful approaches to analyze rank data, but its computational complexity has limited its use to a form based on Kendall distance. Here, new computationally tractable methods for Bayesian inference in Mallows models are developed that work with any right-invariant distance. The method performs inference on the consensus ranking of the items, also when based on partial rankings, such as top-k items or pairwise comparisons. When assessors are many or heterogeneous, a mixture model is proposed for clustering them in homogeneous subgroups, with cluster-specific consensus rankings. Approximate stochastic algorithms are introduced that allow a fully probabilistic analysis, leading to coherent quantification of uncertainties. The method can be used, for example, for making probabilistic predictions on the class membership of assessors based on their ranking of just some items, and for predicting missing individual preferences, as needed in recommendation systems.
Speaker: Elja Arjas
Affiliation: Professor (emeritus) of Mathematics and Statistics, University of Helsinki
Place of Seminar: University of Helsinki
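Illustrative note on the abstract above: in its common textbook form, the Mallows model gives a ranking r a probability proportional to exp(-alpha * d(r, rho)), where rho is the consensus ranking and d is a distance between rankings. The sketch below assumes that form with the Kendall distance and computes the distribution by brute-force enumeration, which is feasible only for a handful of items. It is not the inference method from the talk, and the names `kendall_distance` and `mallows_pmf` are hypothetical.

```python
import itertools
import math


def kendall_distance(r1, r2):
    """Number of item pairs that the two orderings place in opposite order."""
    position = {item: i for i, item in enumerate(r2)}
    seq = [position[item] for item in r1]
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def mallows_pmf(consensus, alpha):
    """Exact Mallows probabilities over all orderings of `consensus`.

    Assumption (illustrative): P(r) is proportional to
    exp(-alpha * kendall_distance(r, consensus)).
    """
    perms = list(itertools.permutations(consensus))
    weights = [math.exp(-alpha * kendall_distance(p, consensus)) for p in perms]
    total = sum(weights)
    return {p: w / total for p, w in zip(perms, weights)}


if __name__ == "__main__":
    pmf = mallows_pmf(("a", "b", "c"), alpha=1.0)
    for ranking, prob in sorted(pmf.items(), key=lambda kv: -kv[1]):
        print(ranking, round(prob, 3))
```

The enumeration grows factorially with the number of items, which is one way to see why tractable approximate inference of the kind described in the abstract is needed for realistic problem sizes.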
Computational Challenges in Analyzing And Moderating Online Social Discussions
Date: October 23, 2017
Abstract: Online social media are a major venue of public discourse today, hosting the opinions of hundreds of millions of individuals. Social media are often credited for providing a technological means to break information barriers and promote diversity and democracy. In practice, however, the opposite effect is often observed: users tend to favor content that agrees with their existing world-view, get less exposure to conflicting viewpoints, and eventually create “echo chambers” and increased polarization. Arguably, without any kind of moderation, current social-media platforms gravitate towards a state in which net-citizens are constantly reinforcing their existing opinions. In this talk we present an ongoing line of work on analyzing and moderating online social discussions. We first consider the questions of detecting controversy using network structure and content, tracking the evolution of polarized discussions, and understanding their properties over time. We then address the problem of designing algorithms to break filter bubbles and reduce polarization. We discuss a number of different strategies such as user and content recommendation, as well as viral approaches.
Speaker: Aristides Gionis
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
Learning Markov Equivalence Classes of Directed Acyclic Graphs: an Objective Bayes Approach
Date: October 30, 2017
Abstract: A Markov equivalence class contains all the Directed Acyclic Graphs (DAGs) encoding the same conditional independencies, and is represented by a Completed Partially Directed DAG (CPDAG), also named Essential Graph (EG). We approach the problem of model selection among noncausal sparse Gaussian DAGs by directly scoring EGs, using an objective Bayes method. Specifically, we construct objective priors for model selection based on the Fractional Bayes Factor, leading to a closed-form expression for the marginal likelihood of an EG. Next we propose an MCMC strategy to explore the space of EGs, possibly accounting for sparsity constraints, and illustrate the performance of our method on simulation studies, as well as on a real dataset. Our method is fully Bayesian and thus provides a coherent quantification of inferential uncertainty, requires minimal prior specification, and proves competitive in learning the structure of the data-generating EG when compared to alternative state-of-the-art algorithms.
Speaker: Guido Consonni
Affiliation: Professor of Statistics, Università Cattolica del Sacro Cuore
Place of Seminar: University of Helsinki
Efficient and accurate approximate Bayesian computation
Date: November 6, 2017
Abstract: Approximate Bayesian computation (ABC) is a method for calculating a posterior distribution when the likelihood is intractable, but simulating the model is feasible. It has numerous important applications, for example in computational biology, material physics, user interface design, etc. However, many ABC algorithms require a large number of simulations, which can be costly. To reduce the cost, Bayesian optimisation (BO) and surrogate models such as Gaussian processes have been proposed. Bayesian optimisation enables deciding intelligently where to simulate the model next, but standard BO approaches are designed for optimisation and not for ABC. Here we address this gap in the existing methods. We model the uncertainty in the ABC posterior density which is due to a limited number of simulations available, and define a loss function that measures this uncertainty. We then propose to select the next model simulation to minimise the expected loss. Experiments show the proposed method is often more accurate than the existing alternatives.
Speaker: Pekka Marttinen
Affiliation: Academy Research Fellow, Department of Computer Science, Aalto University
Place of Seminar: Aalto University
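For readers unfamiliar with ABC, the basic idea described in the abstract above (simulate the model, keep parameters whose simulated data resemble the observations) can be sketched with plain rejection sampling. This is the simplest ABC variant, not the Bayesian-optimisation method of the talk; the Gaussian simulator, the uniform prior, the sample-mean summary, and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def simulate(theta, n, rng):
    """Toy simulator (assumed): n Gaussian draws with mean theta, unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    s_obs = mean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from an assumed uniform prior
        s_sim = mean(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # synthetic "observed" data, true theta = 2
posterior_sample = abc_rejection(observed, n_draws=5000, eps=0.1, rng=rng)
```

Most draws are rejected here, so every accepted sample costs many simulator runs; this is the cost that surrogate-based approaches aim to reduce.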
Learning of Ultra High-Dimensional Potts Models for Bacterial Population Genomics
Date: November 13, 2017
Abstract: The potential for genome-wide modeling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has earlier been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10000-100000 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here we introduce a novel inference method (SuperDCA) which employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 100000 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA thus holds considerable potential in building understanding about numerous organisms at a systems biological level.
Speaker: Jukka Corander
Affiliation: Professor of Statistics, University of Helsinki and University of Oslo
Place of Seminar: University of Helsinki
Towards Intelligent Exergames
Date: November 20, 2017
Abstract: Exergames – video games that require physical activity – hold promise of solving the societal hard problem of motivating people to move. At the same time, artificial intelligence and machine learning are transforming how video games are designed, produced, and tested. Work combining both computational intelligence and exergames is sparse, however. In my talk, I delineate the challenges, opportunities, and my group’s research towards intelligent exergames, building on our previous research on both exergame design (e.g., Augmented Climbing Wall, Kick Ass Kung-Fu) and intelligent control of embodied simulated agents. Video and examples: http://perttu.info
Speaker: Perttu Hämäläinen
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
Correlation-Compressed Direct Coupling Analysis
Date: November 27, 2017
Abstract: Direct Coupling Analysis (DCA) is a powerful tool to find pair-wise dependencies in large biological data sets. It amounts to inferring coefficients in a probabilistic model in an exponential family, and then using the largest such inferred coefficients as predictors for the dependencies of interest. The main computational bottleneck is the inference. As described recently by Jukka Corander in this seminar series, DCA has been done on bacterial whole-genome data, at the price of significant compute time and investment in code optimization.
We have investigated whether DCA can be sped up by first filtering the data on correlations, an approach we call Correlation-Compressed Direct Coupling Analysis (CC-DCA). The computational bottleneck then moves from DCA to the more standard task of finding a subset of most strongly correlated vectors in large data sets. I will describe results obtained so far, and outline what it would take to do CC-DCA on whole-genome data in human and other higher organisms.
This is joint work with Chen-Yi Gao and Hai-Jun Zhou, available as arXiv:1710.04819.
Speaker: Erik Aurell
Affiliation: Professor of Biological Physics, KTH-Royal Institute of Technology
Place of Seminar: University of Helsinki
Eco-friendly Bayes: Reusing Computationally Costly Posteriors in Post-hoc Analyses
Date: December 18, 2017
Abstract: Bayesian inference has many attractive features, but a major challenge is its potentially very high computational cost. While sampling from the prior distribution is often straightforward, the most expensive part is typically conditioning on the data. In many problems, a single data set may not be informative enough to enable reliable inference for a given quantity of interest. This can be difficult to assess in advance and may require a considerable amount of computation to discover, resulting in a weakly informative posterior distribution “gone to waste”. On the other hand, borrowing strength across multiple related data sets using a hierarchical model may be computationally infeasible for very costly models.
As an alternative approach to traditional hierarchical models, we develop in this work a framework which reuses and combines posterior distributions computed on individual data sets to achieve post-hoc borrowing of strength, without the need to re-do expensive computations on the data. As a by-product, we also obtain a notion of meta-analysis for posterior distributions. By adopting the view that posterior distributions are beliefs which reflect the uncertainty about the value of some quantity, we formulate our approach as Bayesian inference with uncertain observations. We further show that this formulation is closely related to belief propagation. Finally, we illustrate the framework with post-hoc analyses of likelihood-free Bayesian inferences.
Speaker: Paul Blomstedt
Affiliation: Department of Computer Science, Aalto University
Place of Seminar: Aalto University
Scalable Algorithms for Extreme Multi-Class and Multi-Label Classification in Big Data
Date: January 15, 2018
Abstract: In the era of big data, large-scale classification involving tens of thousands of target categories is not uncommon. This setting is also referred to as Extreme Classification, and it has recently been shown that the machine learning challenges arising in ranking, recommendation systems and web advertising can be effectively addressed by reducing them to the extreme multi-label classification framework. In this talk, I will discuss two of my recent works and present the TerseSVM and DiSMEC algorithms for extreme multi-class and multi-label classification. The precision@k and nDCG@k results using DiSMEC improve by up to 20% on benchmark datasets over state-of-the-art methods, which are used by Microsoft in the production system of Bing Search. The training process for these algorithms makes use of OpenMP-based distributed architectures and is able to leverage thousands of cores for computation.
Speaker: Rohit Babbar
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
Confident Bayesian Learning of Graphical Models
Date: January 22, 2018
Abstract: Confident Bayesian learning amounts to computing summaries of a posterior distribution either exactly or with probabilistic accuracy guarantees. I will review the state of the art in confident Bayesian structure learning in graphical models, focusing on the class of Bayesian networks and its subclass of chordal Markov networks.
Speaker: Mikko Koivisto
Affiliation: Professor of Computer Science, University of Helsinki
Place of Seminar: University of Helsinki
Special Seminar
Date: January 29 – February 2, 2018
This week we will have multiple machine learning talks at Aalto. We cancel the normal Monday session so that we can meet in them according to the following plan:
Jan 29, Glowacka Dorota
- Talk at 13:00 at TUAS 1171-72
- Title: Machine Learning Meets The User: Empirical Parametrization of Exploratory Search Systems
Jan 30, Heinonen Markus
- Talk at 15:00 at TUAS 1021
- Title: Towards Next-Generation Representational Learning
Feb 1, Marttinen Pekka
- Talk at 16:00 in Seminar Room T3 (CS building)
- Title: Machine Learning for Health
Feb 2, Solin Arno
- Talk at 13:00 at TUAS 1171-72
- Title: Machine Learning for Sensor Fusion in Positioning and Navigation
NOTE: The TUAS building address is Maarintie 8 (next to the CS building)
Studying mutational processes in cancer
Date: February 5, 2018
Abstract: Somatic mutations in cancer have accumulated during its evolution and are caused by different exposures to carcinogens and therapeutic agents, as well as by intrinsic errors that occur during DNA replication. Analysing a set of cancer samples jointly allows us to explain their somatic mutations as a linear combination of (to-be-learned) mutational signatures. In this presentation I will discuss the problem of learning mutational signatures from cancer data using probabilistic modelling and nonnegative matrix factorisation. I will further describe our ongoing work using mutational signatures in the context of drug response prediction and extensions of the basic model to explicitly include DNA repair processes.
Speaker: Ville Mustonen
Affiliation: Professor of Mathematics and Natural Science, University of Helsinki
Place of Seminar: University of Helsinki
Here Be Dragons: High-Dimensional Spaces and Statistical Computation
Date: February 12, 2018
Abstract: With consistently growing data sets and increasingly complex models, the frontier of applied statistics is found in high-dimensional spaces. Unfortunately, most of the intuitions that we take for granted in our low-dimensional, routine experiences don’t persist in these high-dimensional spaces, which makes the development of scalable computational methodologies and algorithms all the more challenging. In this talk I will discuss the counter-intuitive behavior of high-dimensional spaces and the consequences for statistical computation.
Speaker: Michael Betancourt
His research focuses on the development of robust statistical workflows, computational tools, and pedagogical resources that bridge statistical theory and practice and enable scientists to make the most out of their data. The pursuit of general but scalable statistical computation has led him to the intersection of differential geometry and probability theory, where exploiting the inherent geometry of high-dimensional problems naturally leads to algorithms such as Hamiltonian Monte Carlo and its generalizations. He is developing both the theoretical foundations and the practical implementations of these algorithms, the latter specifically in the software ecosystem Stan.
Affiliation: Columbia University
Place of Seminar: Aalto University
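One standard example of the counter-intuitive behavior referred to in the abstract above (chosen here for illustration, not taken from the talk itself) is that almost all of the volume of a high-dimensional ball lies in a thin shell near its surface. Since volume scales as radius to the power of the dimension, the fraction within a relative distance `thickness` of the surface is 1 - (1 - thickness)^dim. The function name and the chosen dimensions are hypothetical.

```python
def outer_shell_fraction(dim, thickness=0.01):
    """Fraction of a dim-dimensional ball's volume lying within `thickness`
    (relative to the radius) of the surface: 1 - (1 - thickness) ** dim."""
    return 1.0 - (1.0 - thickness) ** dim


# Shell fraction for a 1% shell in increasing dimension.
fractions = {d: outer_shell_fraction(d) for d in (1, 2, 10, 100, 1000)}
```

In one dimension the 1% shell holds 1% of the volume, while in 1000 dimensions it holds nearly all of it, so uniform samples almost never land near the centre.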
Learning and Stochastic Control in Gaussian Process Driven Physical Systems
Date: February 19, 2018
Abstract: Traditional machine learning often overemphasises problems where we wish to automatically learn everything from the problem at hand solely using a set of training data. However, in many physical systems we already know much about the physics, typically in the form of partial differential equations. For efficient learning in these kinds of systems, it is beneficial to use gray-box models where only the unknown parts are modeled with data-trained machine learning models. This talk is concerned with learning and stochastic control in physical systems which contain unknown input or force signals that we wish to learn from data. These unknown signals are modeled using Gaussian processes (GP) from machine learning. The resulting latent force models (LFMs) can be seen as hybrid models that contain a first-principles physical model part and a non-parametric GP model part. We present and discuss methods for learning and stochastic control in these kinds of models.
Speaker: Simo Särkkä
Simo Särkkä is an Associate Professor and Academy Research Fellow with Aalto University, Technical Advisor of IndoorAtlas Ltd., and an Adjunct Professor with Tampere University of Technology and Lappeenranta University of Technology. His research interests are in multi-sensor data processing systems with applications in artificial intelligence, machine learning, inverse problems, location sensing, health technology, and brain imaging. He has authored or coauthored around 100 peer-reviewed scientific articles, and his book “Bayesian Filtering and Smoothing” and its Chinese translation were published by Cambridge University Press in 2013 and 2015, respectively. He is a Senior Member of IEEE and serves as an Associate Editor of IEEE Signal Processing Letters.
Affiliation: Professor of Electrical Engineering and Automation, Aalto University
Place of Seminar: University of Helsinki
Bayesian Deep Learning for Image Data
Date: February 26, 2018
Abstract: Deep learning is the paradigm that lies at the heart of state-of-the-art machine learning approaches. Despite their groundbreaking success on a wide range of applications, deep neural nets suffer from: i) being severely prone to overfitting, ii) requiring intensive handcrafting in topology design, iii) being agnostic to model uncertainty, and iv) demanding large volumes of labeled data. The Bayesian approach provides principled solutions to all of these problems. Bayesian deep learning converts the loss minimization problem of conventional neural nets into a posterior inference problem by assigning prior distributions on synaptic weights. This talk will provide a recap of recent advances in Bayesian neural net inference and detail my contributions to the solution of this problem. I will demonstrate how Bayesian neural nets can achieve groundbreaking performance in weakly-supervised learning, active learning, few-shot learning, and transfer learning setups when applied to medical image analysis and core computer vision tasks. I will conclude with a summary of my ongoing research in reinforcement active learning, video-based imitation learning, and reconciliation of Bayesian Program Learning with Generative Adversarial Nets.
Speaker: Melih Kandemir
Dr. Kandemir studied computer science in Hacettepe University and Bilkent University between 2001 and 2008. Later on, he pursued his doctoral studies in Aalto University (former Helsinki University of Technology) on the development of machine learning models for mental state inference until 2013. He worked as a postdoctoral researcher in Heidelberg University, Heidelberg Collaboratory for Image Processing (HCI) between 2013 and 2016. As of 2017, he is an assistant professor at Özyeğin University, Computer Science Department. Throughout his career, he took part in various research projects in funded collaboration with multinational corporations including Nokia, Robert Bosch GmbH, and Carl Zeiss AG. Bayesian deep learning, few-shot learning, active learning, reinforcement learning, and application of these approaches to computer vision are among his research interests.
Affiliation: Professor of Computer Science, Özyeğin University
Place of Seminar: Aalto University
Artificial Intelligence for Mobility Studies in Urban And Natural Areas
Date: March 5, 2018
Abstract: Understanding how people use and move in space is important for planning, both in urban and natural areas. Recent research has shown that location-based social media data may reveal spatial and temporal patterns of the use of space, and reveal areas where human activities might be detrimental. We have shown that social media data corresponds to real-life spatial and temporal patterns of visitors in national parks and is able to shed light on the use of space in cities, by providing meaningful information about the activities and preferences of people. The overwhelming magnitude of social media data requires special filtering and cleaning as well as tested analysis approaches. We are now using geospatial analysis methods together with machine learning to understand where, when, how and by whom areas are being used, and how and why people and goods move about. Automated text and image content analysis is needed to leverage the full potential of social media data in spatial planning. New applications are also yet to be discovered.
Speaker: Tuuli Toivonen
Affiliation: Professor of Geoinformatics, University of Helsinki
Place of Seminar: University of Helsinki
Finding Outlier Correlations
Date: March 12, 2018
Abstract: Finding strongly correlated pairs of observables is one of the basic tasks in data analysis and machine learning. Assuming we have N observables, there are N(N-1)/2 pairs of distinct observables, which gives rise to quadratic scalability in N if our approach is to explicitly compute all pairwise correlations.
In this talk, we look at algorithm designs that achieve subquadratic scalability in N to find pairs of observables that are strongly correlated compared with the majority of the pairs. Our plan is to start with an exposition of G. Valiant’s breakthrough design [FOCS’12,JACM’15] and then look at subsequent improved designs, including some of our own work.
Based on joint work with M. Karppa, J. Kohonen, and P. Ó Catháin, cf. https://arxiv.org/abs/1510.03895 (ACM TALG, to appear) and https://arxiv.org/abs/1606.05608 (ESA’16).
Speaker: Petteri Kaski
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
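The quadratic baseline that the abstract above contrasts against, explicitly computing all N(N-1)/2 pairwise correlations, can be sketched as follows. This is only the brute-force reference point, not one of the subquadratic designs discussed in the talk; the function names and the toy data are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def strongest_pair(observables):
    """Examine all N(N-1)/2 pairs; return (|correlation|, i, j) of the strongest."""
    return max(
        (abs(pearson(observables[i], observables[j])), i, j)
        for i, j in combinations(range(len(observables)), 2)
    )


# Hypothetical toy data: observables 0 and 3 are nearly identical.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 5.0],
    [5.0, 3.0, 1.0, 4.0, 2.0],
    [1.1, 2.1, 2.9, 4.2, 5.0],
]
n_pairs = len(data) * (len(data) - 1) // 2
best = strongest_pair(data)
```

With N = 4 there are only 6 pairs, but the pair count grows quadratically, which is what motivates the subquadratic algorithms.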
Fun With Relative Distance Comparisons
Date: March 19, 2018
Abstract: The distance between two data points is a fundamental ingredient of many data analysis methods. In this talk I review some of my work on “human computation” algorithms that use relative distance comparisons only, i.e., statements of the form “of items a, b, and c, item c is an outlier”, or “item a is closer to item b than to item c”. Such statements are easier to elicit from human annotators than absolute judgements of distance. I consider the problems of centroid computation (Heikinheimo & Ukkonen, HCOMP 2013), density estimation (Ukkonen et al., HCOMP 2015), embeddings (Amid & Ukkonen, ICML 2015), and clustering (Ukkonen, ICDM 2017), and give a few sneak previews of ongoing work.
Speaker: Antti Ukkonen
Antti Ukkonen is an Academy Research Fellow at the University of Helsinki. He obtained his doctoral degree at Aalto University in 2008, and has since held positions at Yahoo! Research, Helsinki Institute for Information Technology HIIT, and Finnish Institute of Occupational Health. Currently he is the PI in the “Data Science for the Masses” project funded by the Academy of Finland. His research interests include algorithmic aspects of (distributed) human computation and machine learning, as well as applied data science.
Affiliation: Professor of Computer Science, University of Helsinki
Place of Seminar: University of Helsinki
Minisymposium on Agile Probabilistic AI
Date: March 26, 2018
Organizers: Aki Vehtari and Arto Klami
Talks |
Probabilistic Programming and Stan
Abstract: Probabilistic programming (PP) makes it easy to write new probabilistic models and PP frameworks then allow automated inference for those models. I describe the generic idea of PP, give some examples of software frameworks designed for different PP purposes, and focus more on recent development in Stan. Speaker: Aki Vehtari |
Automated Variational Inference
Abstract: Automated inference for generic models which can be programmed with probabilistic programming languages is challenging. I describe methods based on modern variational inference which are used to speed-up inference for big data. Speaker: Arto Klami |
Flash Talks |
ELFI – Engine for Likelihood Free Inference
Abstract: ELFI – Engine for Likelihood Free Inference provides a probabilistic programming framework for combining probabilistic models with stochastic simulators and performs automated inference using efficient likelihood free inference algorithms. Speaker: Henri Vuollekoski |
Place of Seminar: Aalto University
Minisymposium on Simulator-Based Inference
Date: April 9, 2018
Organizers: Samuel Kaski and Jaakko Lehtinen
Talks |
Inferring Cognitive Simulators From Data
Abstract: I will discuss the intriguing problem of inferring parameters of cognitive simulator models from user interaction data. The parameters can include the goals, interests and capabilities of the user, which become disentangled in the inference. Speaker: Samuel Kaski |
Negative Frequency-Dependent Selection Dictates the Fate of Bacteria
Abstract: In recent work we discovered that negative frequency-dependent selection (NFDS) acting on accessory genes appears as the dominating evolutionary force of populations of Streptococcus pneumoniae, which is a major human pathogen. This discovery was greatly facilitated by the recent advances in ABC inference brought by Bayesian optimization which has accelerated model fitting by several orders of magnitude. I will discuss the biological basis of the NFDS principle and show emerging evidence that it is commonly dictating the fate of bacteria also in ubiquitous organisms such as Escherichia coli. Speaker: Jukka Corander |
Why Learn Something You Know?
Abstract: Much of science can be seen as a search of mathematical models that predict observable quantities based on other observable quantities: future positions of stars based on their past positions, incidence of cancer based on biological health measurements, etc. Unlike handcrafted models derived from first principles employed in e.g. physics, many modern machine learning and AI techniques approach the same issue from a different perspective: fixing a highly powerful but general (not problem-specific) model, and setting its parameters based on data. Results are dramatic but often uninterpretable – we do not know precisely how the model comes to its conclusions. In this overview talk, I’ll present thoughts on combining the two approaches, along with recent exciting examples. Speaker: Jaakko Lehtinen |
Flash Talks |
Simulator-Based Inference in Robotics
Speaker: Ville Kyrki |
Optimizing Technologies with Bayesian Inference
Speaker: Milica Todorovic |
ABC and Model Selection
Speaker: Henri Pesonen |
Place of Seminar: University of Helsinki
NOTE: The seminar will be in Chemicum A129, Kumpula
Machine Learning using Unreliable Components: From Matrix Operations to Neural Networks and Stochastic Gradient Descent
Date: April 16, 2018
Abstract: Reliable computation at scale is one key challenge in large-scale machine learning today. Unreliability in computation can manifest itself in many forms, e.g. (i) “straggling” of a few slow processing nodes which can delay your entire computation, e.g., in synchronous gradient descent; (ii) processor failures; (iii) “soft-errors,” which are undetected errors where nodes can produce garbage outputs. My focus is on the problem of training using unreliable nodes.
First, I will introduce the problem of training model-parallel neural networks in the presence of soft-errors. This problem was in fact the motivation of von Neumann’s 1956 study, which started the field of computing using unreliable components. We propose “CodeNet”, a unified, error-correction coding-based strategy that is woven into the linear algebraic operations of neural network training to provide resilience to errors in every operation during training. I will also survey some of the notable results in the emerging area of “coded computing,” including my own work on matrix-vector and matrix-matrix products, that outperform classical results in fault-tolerant computing by arbitrarily large factors in expected time. Next, I will discuss the error-runtime trade-offs of various data-parallel approaches in training machine learning models in the presence of stragglers, in particular, synchronous and asynchronous variants of SGD. Finally, I will discuss some open problems in this exciting and interdisciplinary area.
Parts of this work have been accepted at AISTATS 2018 and ISIT 2018.
Speaker: Sanghamitra Dutta
Affiliation: PhD Candidate, Carnegie Mellon University
Place of Seminar: Aalto University
Deep Learning Spectroscopy: Neural Networks for Molecular Excitation Spectra
Date: April 23, 2018
Abstract: For the study of molecules and materials, conventional theoretical and experimental spectroscopies are well established in the natural sciences, but they are slow and expensive. Our objective is to launch a new era of artificial intelligence (AI) enhanced spectroscopy that learns from the plethora of already available experimental and theoretical spectroscopy data. Once trained, the AI can make predictions of spectra instantly and at no further cost. In this new paradigm, AI spectroscopy would complement conventional theoretical and experimental spectroscopy to greatly accelerate the spectroscopic analysis of materials, make predictions for novel and hitherto uncharacterized materials, and discover entirely new materials.
In this presentation, I will introduce the two approaches we have used to learn spectroscopic properties: kernel ridge regression (KRR) and deep neural networks (NN). The models are trained and validated on data generated by accurate state-of-the-art quantum chemistry computations for diverse subsets of the GDB-13 and GDB-17 molecular datasets [1,2]. The molecules are represented by a simple, easily attainable numerical description based on nuclear charges and Cartesian coordinates [3,4]. The complexity of the molecular descriptor [4] turns out to be crucial for the learning success, as I will demonstrate for KRR. I will then show how we can learn spectra (i.e. continuous target quantities) with NNs. We design and test three different NN architectures: multilayer perceptron (MLP) [5], convolutional neural network (CNN) and deep tensor neural network (DTNN) [6]. Already the MLP is able to learn spectra, but the learning quality improves significantly for the CNN and reaches its best performance for the DTNN. Both CNN and DTNN capture even small nuances in the spectral shape.
* This work was performed in collaboration with A. Stuke, K. Ghosh, L. Himanen, M. Todorovic, and A. Vehtari
[1] L. C. Blum et al., J. Am. Chem. Soc. 131, 8732 (2009)
[2] R. Ramakrishnan et al., Scientific Data 1, 140022 (2014)
[3] M. Rupp et al., Phys. Rev. Lett. 108, 058301 (2012)
[4] H. Huo and M. Rupp, arXiv:1704.06439
[5] G. Montavon et al., New J. Phys. 15, 095003 (2013)
[6] K. T. Schutt et al., Nat. Comm. 8, 13890 (2017)
Speaker: Patrick Rinke
Affiliation: Professor of Physics, Aalto University
Place of Seminar: University of Helsinki
Fairness-aware machine intelligence: foundations and challenges
Date: May 7, 2018
Abstract: Algorithmic decision making is pervasive: the prices we pay, the news or movies we see, and the jobs or credits we get are advised by algorithms. Not so long ago the public used to think that decision making by computers is inherently objective, but the realization that models learned from data are no more objective than the data on which they have been trained is becoming common. Fairness-aware machine learning emerged as a discipline about ten years ago with the main goal of correcting algorithmically for potential biases towards sensitive groups of people. The talk will discuss the main challenges, existing solutions and current trends in this research area.
Speaker: Indre Zliobaite
Affiliation: University of Helsinki
Place of Seminar: University of Helsinki
Minisymposium on Interactive AI
Date: May 14, 2018
Organizers: Giulio Jacucci and Antti Oulasvirta
Talks |
Computational Rationality: Convergence of Machine Learning, Neuroscience, Cognitive Science, Robotics, and HCI
Speaker: Antti Oulasvirta |
Interactive Intent Modelling and Knowledge Elicitation
Speaker: Samuel Kaski |
Interactive Robot Learning
Speaker: Ville Kyrki |
Flash Talks |
Emotional Rationality
Speaker: Jussi Jokinen |
Place of Seminar: Aalto University
Learning Differential Equation Models
Date: May 21, 2018
Abstract: Mechanistic models for biochemical networks are often constructed in the form of nonlinear ordinary or stochastic differential equation systems. Inference of such dynamic models from experimental data has attracted a lot of interest in the systems biology field, but the inference task is generally considered to be challenging. In this talk I will describe various differential equation and machine learning models for dynamic molecular networks, including parametric and novel non-parametric alternatives, and describe how the model parameters as well as network structure can be efficiently inferred from experimental time-course data. I will demonstrate applications of these models in biological and non-biological contexts.
Speaker: Harri Lähdesmäki
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: University of Helsinki
Minisymposium on Privacy Preserving and Secure AI
Date: May 28, 2018
Organizers: N. Asokan and Antti Honkela
Talks |
Machine Learning in the Presence of Adversaries
Abstract: AI, and machine learning in particular, is being applied in a wide range of problem domains. Security/privacy problems are no exception. Typically, the effectiveness of ML applications is evaluated in terms of considerations like accuracy and cost. An equally important consideration is how adversaries can interfere with ML applications. Considering the adversary at different stages of an ML pipeline can help us understand different security and privacy problems of ML applications. |
Android Malware Classification: How to Deal With Extremely Sparse Features
Abstract: In this talk I provide some insights about the specifics of working with high-dimensional features from Android files. Since each package to be classified can be described based on strings extracted from its files, the overall feature size grows drastically with the size of training set. To deal with sparse feature sets, we experimented with various approaches including log-odds ratio, random projections, feature clustering and non-random matrix factorization. In this talk, I describe the framework for Android Malware Classification with a focus on the proposed dimensionality reduction approaches. |
Reducing False Positives in Intrusion Detection
Abstract: The F-Secure Rapid Detection and Response Service is an intrusion detection service provided by F-Secure to companies. In this solution, we analyze the events generated by the clients and raise an alarm when suspicious behavior occurs. These alarms are further analyzed by experts, and if needed, a client is contacted. Some of these alarms are false positives, resulting in unnecessary analysis work by the experts. In this talk we describe the challenges and our approach to reducing such false positives.
Stealing DNN Models: Attacks and Defenses
Abstract: Today, machine learning models constitute a business advantage for several companies. Companies want to leverage ML models to provide prediction services to clients. However, direct (i.e. white-box) access to models has been shown to be vulnerable to adversarial machine learning, where a malicious client may craft adversarial examples: samples that by design are misclassified by the model. This has serious consequences for several business sectors, including autonomous driving and malware detection. Model confidentiality is paramount in these scenarios. Consequently, model owners do not want to reveal the model to the client, but may provide black-box predictions to them via well-defined APIs. Nevertheless, prediction APIs still leak information (predictions) that makes it possible to mount model extraction attacks by repeatedly querying the model via the prediction API. Model extraction attacks threaten the confidentiality of the model as well as its integrity, since the stolen model can be used to create transferable adversarial examples. We analyze model extraction attacks on DNNs via structured tests and present a new way of generating synthetic queries, which outperforms the state of the art. We then propose a generic approach to effectively detect model extraction attacks: PRADA. It analyzes how the distribution of successive queries to the model evolves and detects abrupt deviations. We show that PRADA can detect all known model extraction attacks with a 100% success rate and no false positives.
Speaker: Mika Juuti
Differential Privacy and Machine Learning
Abstract: Differential privacy provides a flexible framework for privacy-aware computation. It provides strong privacy guarantees by requiring that the results of the computation should not depend too strongly on any single individual’s data. In my talk I will introduce differentially private machine learning, with an emphasis on Bayesian methods. I will also present an application of differentially private machine learning to personalised cancer drug sensitivity prediction using gene expression data.
Differentially Private Bayesian Learning on Distributed Data
Abstract: Many applications of machine learning, for example in health care, would benefit from methods that can guarantee privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. The standard DP algorithms require a single trusted party to have access to the entire data, which is a clear weakness, or add prohibitive amounts of noise to learning. I discuss a novel method for DP Bayesian learning in a distributed setting, where each party only holds a single sample or a few samples of the data. The method relies on secure multi-party computation combined with the well-established Gaussian mechanism for DP. The talk is based on our recent paper.
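The Gaussian mechanism mentioned above can be sketched as follows. This is a minimal, centralised illustration and not the distributed method of the talk: it omits the secure multi-party computation entirely. It assumes the classical calibration sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon, which holds for 0 < epsilon < 1, and add/remove neighbouring datasets, so a sum of values clipped to [-clip, clip] has sensitivity clip. The names gaussian_sigma and private_sum are hypothetical.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise standard deviation of the classical Gaussian mechanism.

    Assumes 0 < epsilon < 1 and 0 < delta < 1, the range in which the
    classical calibration gives an (epsilon, delta)-DP guarantee.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_sum(values, clip, epsilon, delta, rng):
    """Release a noisy sum of per-sample values.

    Each value is clipped to [-clip, clip], so adding or removing one
    sample changes the sum by at most `clip` (the assumed sensitivity).
    `rng` is any object with a gauss(mu, sigma) method, e.g. random.Random.
    """
    clipped = [max(-clip, min(clip, v)) for v in values]
    sigma = gaussian_sigma(clip, epsilon, delta)
    return sum(clipped) + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    print(private_sum([0.2, 5.0, -3.0], clip=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

In a distributed setting the noisy sum would have to be formed without any single party seeing the individual clipped values, which is the role the abstract assigns to secure multi-party computation.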
Privacy Preservation with Federated Learning in Personalized Recommendation Systems
Abstract: Recent events have brought public attention to how companies capture, store and exploit users’ personal data in their various services. Enforcement of the EU’s GDPR starts in May 2018, regulating how companies access, store and process user data. Companies can now suffer reputational damage and large financial penalties if they fail to respect the rights of users in how they manage their data. At Huawei we are looking at different approaches to enhancing Huawei user privacy while at the same time providing an optimal user experience. In this talk we discuss one approach, based on Federated Learning, to generating personalized recommendations for use in different Huawei mobile services. The target of the research has been to generate high-quality personalized recommendations on mobile devices without moving the user data from the user’s own device.
Place of Seminar: Aalto University
The Power of Gaussian Processes: Magnetic Localisation and Mapping
Date: September 3, 2018
Abstract: Gaussian processes (GPs) are convenient tools for model-building and inference. This talk goes through how to encode knowledge from high-school physics into a GP model for the ambient magnetic field (observed by a smartphone compass). Small disturbances in the magnetic field are then used in simultaneous localisation and mapping (SLAM) to simultaneously build a map of the magnetic field and localise the device on it by Rao-Blackwellised particle filtering (Sequential Monte Carlo). The paper presenting this setup recently won the Best Paper Award at the International Conference on Information Fusion 2018.
Speaker: Arno Solin
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
Progressive Growing of GANs for Improved Quality, Stability, and Variation
Date: September 17, 2018
Abstract: We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024^2. We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset.
Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen, ICLR 2018
Speaker: Tero Karras
Affiliation: Research Scientist, NVIDIA
Place of Seminar: Aalto University