Machine Learning Coffee Seminars

Spring 2018

History of Previous Talks in Spring 2018

Minisymposium on Privacy Preserving and Secure AI

Date: May 28, 2018

Organizers: N. Asokan and Antti Honkela

Talks
Machine Learning in the Presence of Adversaries

Abstract: AI, and machine learning in particular, is being applied in a wide range of problem domains. Security/privacy problems are no exception. Typically, effectiveness of ML applications is evaluated in terms of considerations like accuracy and cost. An equally important consideration is how adversaries can interfere with ML applications. Considering the adversary at different stages of an ML pipeline can help us understand different security and privacy problems of ML applications.
Speaker: N. Asokan

Slides

Android Malware Classification: How to Deal With Extremely Sparse Features

Abstract: In this talk I provide some insights about the specifics of working with high-dimensional features from Android files. Since each package to be classified can be described based on strings extracted from its files, the overall feature size grows drastically with the size of the training set. To deal with sparse feature sets, we experimented with various approaches including log-odds ratio, random projections, feature clustering and non-random matrix factorization. In this talk, I describe the framework for Android Malware Classification with a focus on the proposed dimensionality reduction approaches.
Speaker: Luiza Sayfullina

Slides

Reducing False Positives in Intrusion Detection

Abstract: The F-Secure Rapid Detection and Response Service is an intrusion detection service provided by F-Secure to companies. In this solution, we analyze the events generated by the clients, and raise an alarm when suspicious behavior occurs. These alarms are further analyzed by experts, and if needed, a client is contacted. Some of these alarms are false positives, resulting in unnecessary analysis work by the experts. In this talk we describe the challenges and our approach to reducing such false positives.
Speaker: Nikolaj Tatti

Slides

Stealing DNN Models: Attacks and Defenses

Abstract: Today machine learning models constitute business advantages to several companies. Companies want to leverage ML models to provide prediction services to clients. However, direct (i.e. white-box) access to models has been shown to be vulnerable to adversarial machine learning, where a malicious client may craft adversarial examples: samples that by design are misclassified by the model. This has serious consequences for several business sectors, including autonomous driving and malware detection. Model confidentiality is paramount in these scenarios.

Consequently, model owners do not want to reveal the model to the client, but may provide black-box predictions via well-defined APIs to them. Nevertheless, prediction APIs still leak information (predictions) that make it possible to mount model extraction attacks by repeatedly querying the model via the prediction API. Model extraction attacks threaten the confidentiality of the model, as well as the integrity, since the stolen model can be used to create transferable adversarial examples.

We analyze model extraction attacks on DNNs via structured tests and present a new way of generating synthetic queries, which outperforms the state of the art. We then propose a generic approach to effectively detect model extraction attacks: PRADA. It analyzes how the distribution of successive queries to the model evolves and detects abrupt deviations. We show that PRADA can detect all known model extraction attacks with a 100% success rate and no false positives.

Speaker: Mika Juuti

Slides

Differential Privacy and Machine Learning

Abstract: Differential privacy provides a flexible framework for privacy-aware computation. It provides strong privacy guarantees through requiring that the results of the computation should not depend too strongly on any single individual’s data. In my talk I will introduce differentially private machine learning, with an emphasis on Bayesian methods. I will also present an application of differentially private machine learning to personalised cancer drug sensitivity prediction using gene expression data.
Speaker: Antti Honkela

Differentially private Bayesian learning on distributed data

Abstract: Many applications of machine learning, for example in health care, would benefit from methods that can guarantee privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. The standard DP algorithms require a single trusted party to have access to the entire data, which is a clear weakness, or add prohibitive amounts of noise to learning. I discuss a novel method for DP Bayesian learning in a distributed setting, where each party only holds a single sample or a few samples of the data. The method relies on secure multi-party computation combined with the well-established Gaussian mechanism for DP. The talk is based on our recent paper.
Speaker: Mikko Heikkilä
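
As a rough illustration of the Gaussian mechanism mentioned in the abstract above, the following sketch releases a noisy sum of clipped values. It is not the distributed method from the talk: there is no secure multi-party computation here, and the function names, the clipping bound and the add/remove-one-record neighbourhood definition are assumptions made for this example. The noise scale uses the classical calibration sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon, which is valid for 0 < epsilon < 1.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Classical Gaussian-mechanism noise scale (valid for 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_sum(values, clip, epsilon, delta, rng=random):
    """Release sum(values) with (epsilon, delta)-DP under add/remove-one neighbours."""
    # Clipping bounds each record's contribution, so adding or removing
    # one record changes the sum by at most `clip` (the sensitivity).
    clipped = [max(-clip, min(clip, v)) for v in values]
    sigma = gaussian_sigma(clip, epsilon, delta)
    return sum(clipped) + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    data = [0.3, 0.9, -0.2, 0.7]  # hypothetical per-person values
    print(private_sum(data, clip=1.0, epsilon=0.5, delta=1e-5))
```

A smaller epsilon or delta gives a larger sigma, which is the privacy versus accuracy trade-off the abstract alludes to when it mentions prohibitive amounts of noise.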

Privacy Preservation with Federated Learning in Personalized Recommendation Systems

Abstract: Recent events have brought public attention to how companies capture, store and exploit users’ personal data in their various services.

The EU’s GDPR enforcement starts in May 2018 regulating how companies access, store and process user data. Companies can now suffer reputational damage and large financial penalties if they fail to respect the rights of users and how they manage their data. At Huawei we are looking at different approaches to enhancing Huawei user privacy while at the same time providing an optimal user experience. In this talk we discuss one approach to generating personalized recommendations for use in different Huawei mobile services based on Federated Learning. The target of the research has been to generate high quality personalized recommendations on mobile devices without moving the user data from the user’s own device.
Speaker: Adrian Flanagan

Place of Seminar: Aalto University

Learning Differential Equation Models

Date: May 21, 2018
Abstract: Mechanistic models for biochemical networks are often constructed in the form of nonlinear ordinary or stochastic differential equation systems. Inference of such dynamic models from experimental data has attracted a lot of interest in the systems biology field, but the inference task is generally considered to be challenging. In this talk I will describe various differential equation and machine learning models for dynamic molecular networks, including parametric and novel non-parametric alternatives, and describe how the model parameters as well as network structure can be efficiently inferred from experimental time-course data. I will demonstrate applications of these models in biological and non-biological contexts.

Speaker: Harri Lähdesmäki

Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: University of Helsinki

Minisymposium on Interactive AI

Date: May 14, 2018

Organizers: Giulio Jacucci and Antti Oulasvirta

Talks
Computational Rationality: Convergence of Machine Learning, Neuroscience, Cognitive Science, Robotics, and HCI

Speaker: Antti Oulasvirta

Interactive Intent Modelling and Knowledge Elicitation

Speaker: Samuel Kaski

Interactive Robot Learning

Speaker: Ville Kyrki

Flash Talks
Emotional Rationality

Speaker: Jussi Jokinen

Place of Seminar: Aalto University

Fairness-aware machine intelligence: foundations and challenges

Date: May 7, 2018

Abstract: Algorithmic decision making is pervasive: the prices we pay, the news or movies we see, and the jobs or credits we get are advised by algorithms. Not so long ago the public used to think that decision making by computers is inherently objective, but the realization that models learned from data are no more objective than the data on which they have been trained is becoming common. Fairness-aware machine learning emerged as a discipline about ten years ago with the main goal of correcting algorithmically for potential biases towards sensitive groups of people. The talk will discuss the main challenges, existing solutions and current trends in this research area.

Speaker: Indre Zliobaite

Affiliation: University of Helsinki

Place of Seminar: University of Helsinki

Deep Learning Spectroscopy: Neural Networks for Molecular Excitation Spectra

Date: April 23, 2018

Abstract: For the study of molecules and materials, conventional theoretical and experimental spectroscopies are well established in the natural sciences, but they are slow and expensive. Our objective is to launch a new era of artificial intelligence (AI) enhanced spectroscopy that learns from the plethora of already available experimental and theoretical spectroscopy data. Once trained, the AI can make predictions of spectra instantly and at no further cost. In this new paradigm, AI spectroscopy would complement conventional theoretical and experimental spectroscopy to greatly accelerate the spectroscopic analysis of materials, make predictions for novel and hitherto uncharacterized materials, and discover entirely new materials.

In this presentation, I will introduce the two approaches we have used to learn spectroscopic properties: kernel ridge regression (KRR) and deep neural networks (NN). The models are trained and validated on data generated by accurate state-of-the-art quantum chemistry computations for diverse subsets of the GDB-13 and GDB-17 molecular datasets [1,2]. The molecules are represented by a simple, easily attainable numerical description based on nuclear charges and Cartesian coordinates [3,4]. The complexity of the molecular descriptor [4] turns out to be crucial for the learning success, as I will demonstrate for KRR. I will then show how we can learn spectra (i.e. continuous target quantities) with NNs. We design and test three different NN architectures: multilayer perceptron (MLP) [5], convolutional neural network (CNN) and deep tensor neural network (DTNN) [6]. Already the MLP is able to learn spectra, but the learning quality improves significantly for the CNN and reaches its best performance for the DTNN. Both CNN and DTNN capture even small nuances in the spectral shape.

* This work was performed in collaboration with A. Stuke, K. Ghosh, L. Himanen, M. Todorovic, and A. Vehtari

[1] L. C. Blum et al., J. Am. Chem. Soc. 131, 8732 (2009)

[2] R. Ramakrishnan et al., Scientific Data 1, 140022 (2014)

[3] M. Rupp et al., Phys. Rev. Lett. 108, 058301 (2012)

[4] H. Huo and M. Rupp, arXiv:1704.06439

[5] G. Montavon et al., New J. Phys. 15, 095003 (2013)

[6] K. T. Schutt et al., Nat. Comm. 8, 13890 (2017)
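
To make "a numerical description based on nuclear charges and Cartesian coordinates" concrete, here is a minimal sketch of the Coulomb matrix, assuming that this is the descriptor meant by reference [3]. Diagonal entries are 0.5 * Z_i^2.4 and off-diagonal entries are Z_i * Z_j / |R_i - R_j|, with distances in atomic units (bohr). The function name and the H2 geometry are illustrative choices, not taken from the talk.

```python
import math


def coulomb_matrix(charges, coords):
    """Coulomb matrix from nuclear charges Z_i and Cartesian positions R_i (bohr)."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: a fit to the energy of the free atom.
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Coulomb repulsion between nuclei i and j.
                matrix[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return matrix


if __name__ == "__main__":
    # Hypothetical H2 molecule with a bond length of 1.4 bohr.
    for row in coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)]):
        print(row)
```

In a KRR or NN pipeline the matrix would typically be flattened (or sorted and padded to a fixed size) before being used as an input vector.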

Speaker: Patrick Rinke

Affiliation: Professor of Physics, Aalto University

Place of Seminar: University of Helsinki

Machine Learning using Unreliable Components: From Matrix Operations to Neural Networks and Stochastic Gradient Descent

Date: April 16, 2018

Abstract: Reliable computation at scale is one key challenge in large-scale machine learning today. Unreliability in computation can manifest itself in many forms, e.g. (i) “straggling” of a few slow processing nodes, which can delay your entire computation, e.g., in synchronous gradient descent; (ii) processor failures; (iii) “soft errors”, which are undetected errors where nodes can produce garbage outputs. My focus is on the problem of training using unreliable nodes.

First, I will introduce the problem of training model-parallel neural networks in the presence of soft errors. This problem was in fact the motivation of von Neumann’s 1956 study, which started the field of computing using unreliable components. We propose “CodeNet”, a unified, error-correction coding-based strategy that is woven into the linear algebraic operations of neural network training to provide resilience to errors in every operation during training. I will also survey some of the notable results in the emerging area of “coded computing,” including my own work on matrix-vector and matrix-matrix products, that outperform classical results in fault-tolerant computing by arbitrarily large factors in expected time. Next, I will discuss the error-runtime trade-offs of various data-parallel approaches in training machine learning models in the presence of stragglers, in particular, synchronous and asynchronous variants of SGD. Finally, I will discuss some open problems in this exciting and interdisciplinary area.

Parts of this work have been accepted at AISTATS 2018 and ISIT 2018.
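
The idea behind coded matrix-vector products can be illustrated with a small textbook-style sketch. This is not CodeNet or any of the speaker's constructions: the matrix is split into two row blocks, a third "parity" block holds their sum, and the product can be recovered from any two of the three workers, so one straggler does not delay the result. The worker names and helper functions are hypothetical.

```python
def matvec(rows, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(block1, block2):
    """Assign one row block per worker plus a parity block (elementwise sum)."""
    parity = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(block1, block2)]
    return {"w1": block1, "w2": block2, "parity": parity}


def decode(results):
    """Recover [block1 @ x, block2 @ x] from the results of any two workers."""
    if len(results) < 2:
        raise ValueError("need results from at least two of the three workers")
    if "w1" in results and "w2" in results:
        y1, y2 = results["w1"], results["w2"]
    elif "w1" in results:
        y1 = results["w1"]
        y2 = [p - a for p, a in zip(results["parity"], y1)]  # parity - y1
    else:
        y2 = results["w2"]
        y1 = [p - b for p, b in zip(results["parity"], y2)]  # parity - y2
    return y1 + y2


if __name__ == "__main__":
    tasks = encode([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    x = [1, 1]
    # Pretend worker "w2" is a straggler and never reports back.
    finished = {name: matvec(tasks[name], x) for name in ("w1", "parity")}
    print(decode(finished))
```

The price of this resilience is redundancy: three workers do the work that two would do without coding.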

Speaker: Sanghamitra Dutta

Affiliation: PhD Candidate, Carnegie Mellon University

Place of Seminar: Aalto University

Slides

Minisymposium on Simulator-Based Inference

Date: April 9, 2018

Organizers: Samuel Kaski and Jaakko Lehtinen

Talks
Inferring Cognitive Simulators From Data

Abstract: I will discuss the intriguing problem of inferring parameters of cognitive simulator models from user interaction data. The parameters can include the goals, interests and capabilities of the user, which become disentangled in the inference.

Speaker: Samuel Kaski

Negative Frequency-Dependent Selection Dictates the Fate of Bacteria

Abstract: In recent work we discovered that negative frequency-dependent selection (NFDS) acting on accessory genes appears as the dominating evolutionary force of populations of Streptococcus pneumoniae, which is a major human pathogen. This discovery was greatly facilitated by the recent advances in ABC inference brought by Bayesian optimization which has accelerated model fitting by several orders of magnitude. I will discuss the biological basis of the NFDS principle and show emerging evidence that it is commonly dictating the fate of bacteria also in ubiquitous organisms such as Escherichia coli.

Speaker: Jukka Corander

Why Learn Something You Know?

Abstract: Much of science can be seen as a search for mathematical models that predict observable quantities based on other observable quantities: future positions of stars based on their past positions, incidence of cancer based on biological health measurements, etc. Unlike handcrafted models derived from first principles employed in e.g. physics, many modern machine learning and AI techniques approach the same issue from a different perspective: fixing a highly powerful but general (not problem-specific) model, and setting its parameters based on data. Results are dramatic but often uninterpretable – we do not know precisely how the model comes to its conclusions. In this overview talk, I’ll present thoughts on combining the two approaches, along with recent exciting examples.

Speaker: Jaakko Lehtinen

Flash Talks
Simulator-Based Inference in Robotics

Speaker: Ville Kyrki

Optimizing Technologies with Bayesian Inference

Speaker: Milica Todorovic

ABC and Model Selection

Speaker: Henri Pesonen

Place of Seminar: University of Helsinki

NOTE: The seminar will be in Chemicum A129, Kumpula

Minisymposium on Agile Probabilistic AI

Date: March 26, 2018

Organizers: Aki Vehtari and Arto Klami

Talks
Probabilistic Programming and Stan

Abstract: Probabilistic programming (PP) makes it easy to write new probabilistic models, and PP frameworks then allow automated inference for those models. I describe the generic idea of PP, give some examples of software frameworks designed for different PP purposes, and focus more on recent developments in Stan.

Speaker: Aki Vehtari

Slides

Automated Variational Inference

Abstract: Automated inference for generic models which can be programmed with probabilistic programming languages is challenging. I describe methods based on modern variational inference which are used to speed up inference for big data.

Speaker: Arto Klami

Slides

Flash Talks
ELFI – Engine for Likelihood Free Inference

Abstract: ELFI – Engine for Likelihood Free Inference provides a probabilistic programming framework for combining probabilistic models with stochastic simulators and performs automated inference using efficient likelihood free inference algorithms.

Speaker: Henri Vuollekoski
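
For readers unfamiliar with likelihood-free inference, the simplest algorithm of this family is rejection ABC: draw a parameter from the prior, run the simulator, and keep the parameter if a summary of the simulated data is close to the summary of the observed data. The sketch below is generic and does not use the ELFI API. The Gaussian simulator, the uniform prior on [-5, 5], the sample-mean summary and the tolerance are all assumptions chosen for illustration.

```python
import random
import statistics


def simulator(mu, n, rng):
    """Toy stochastic simulator: n draws from a normal with mean mu and unit variance."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, tolerance, rng):
    """Return prior draws whose simulated summary lies within `tolerance` of the data."""
    observed_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a candidate from the prior
        simulated = simulator(mu, len(observed), rng)
        if abs(statistics.fmean(simulated) - observed_summary) <= tolerance:
            accepted.append(mu)  # accepted draws approximate the posterior
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    data = simulator(2.0, 50, rng)  # synthetic "observations" with true mu = 2
    posterior = abc_rejection(data, n_draws=5000, tolerance=0.2, rng=rng)
    print(len(posterior), statistics.fmean(posterior))
```

Rejection ABC wastes most simulations when the tolerance is small, which is why more efficient likelihood-free algorithms are of interest.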

Place of Seminar: Aalto University

Fun With Relative Distance Comparisons

Date: March 19, 2018

Abstract: The distance between two data points is a fundamental ingredient of many data analysis methods. In this talk I review some of my work on “human computation” algorithms that use relative distance comparisons only, i.e., statements of the form “of items a, b, and c, item c is an outlier”, or “item a is closer to item b than to item c”. Such statements are easier to elicit from human annotators than absolute judgements of distance. I consider the problems of centroid computation (Heikinheimo & Ukkonen, HCOMP 2013), density estimation (Ukkonen et al., HCOMP 2015), embeddings (Amid & Ukkonen, ICML 2015), and clustering (Ukkonen, ICDM 2017), and give a few sneak previews of ongoing work.
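
The flavour of computing with relative comparisons only can be sketched as follows. This is an illustrative heuristic and not a reproduction of the cited algorithms: a simulated annotator names the outlier of each triple, and the item named least often is taken as a centroid-like representative. In practice the answers would come from people rather than from a known distance function, and the function names here are hypothetical.

```python
from itertools import combinations


def outlier_of(triple, dist):
    """Simulated annotator: the outlier is the item farthest from the other two."""
    a, b, c = triple
    totals = {
        a: dist(a, b) + dist(a, c),
        b: dist(b, a) + dist(b, c),
        c: dist(c, a) + dist(c, b),
    }
    return max(triple, key=totals.get)


def least_outlying_item(items, dist):
    """Pick the item that is named as the outlier in the fewest triples."""
    votes = {item: 0 for item in items}
    for triple in combinations(items, 3):
        votes[outlier_of(triple, dist)] += 1
    return min(items, key=votes.get)


if __name__ == "__main__":
    points = [0.0, 1.0, 2.0, 3.0, 10.0]
    print(least_outlying_item(points, lambda x, y: abs(x - y)))
```

Only the identity of the outlier is used, never a numeric distance, which is the kind of judgement the abstract describes as easier to elicit from human annotators.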

Speaker: Antti Ukkonen

Antti Ukkonen is an Academy research fellow at University of Helsinki. He obtained his doctoral degree at Aalto University in 2008, and has since held positions at Yahoo! Research, Helsinki Institute for Information Technology HIIT, and Finnish Institute of Occupational Health. Currently he is the PI in the “Data Science for the Masses” project funded by Academy of Finland. His research interests include algorithmic aspects of (distributed) human computation and machine learning, as well as applied data science.

Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: University of Helsinki

Finding Outlier Correlations

Date: March 12, 2018

Abstract: Finding strongly correlated pairs of observables is one of the basic tasks in data analysis and machine learning. Assuming we have N observables, there are N(N-1)/2 pairs of distinct observables, which gives rise to quadratic scalability in N if our approach is to explicitly compute all pairwise correlations.

In this talk, we look at algorithm designs that achieve subquadratic scalability in N to find pairs of observables that are strongly correlated compared with the majority of the pairs. Our plan is to start with an exposition of G. Valiant’s breakthrough design [FOCS’12,JACM’15] and then look at subsequent improved designs, including some of our own work.

Based on joint work with M. Karppa, J. Kohonen, and P. Ó Catháin, cf. https://arxiv.org/abs/1510.03895 (ACM TALG, to appear) and https://arxiv.org/abs/1606.05608 (ESA’16).
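
For reference, the quadratic baseline that the talk aims to beat can be written in a few lines: compute the correlation of every one of the N(N-1)/2 pairs and keep the strongest. The sketch below uses the Pearson correlation and hypothetical function names. It shows only the brute-force approach, not the subquadratic designs discussed in the talk.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strongest_pair(observables):
    """Brute force: examine all N(N-1)/2 pairs and return the most correlated one."""
    pairs = combinations(range(len(observables)), 2)
    return max(pairs, key=lambda ij: abs(pearson(observables[ij[0]], observables[ij[1]])))


if __name__ == "__main__":
    data = [
        [1, 2, 3, 4],
        [2, 1, 4, 3],
        [2, 4, 6, 8.5],
        [4, 1, 3, 2],
    ]
    print(strongest_pair(data))
```

With N observables of length d this costs on the order of N^2 * d operations, which is the scaling the abstract refers to as quadratic in N.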

Speaker: Petteri Kaski

Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: Aalto University

Artificial Intelligence for Mobility Studies in Urban And Natural Areas

Date: March 5, 2018

Abstract: Understanding how people use and move in space is important for planning, both in urban and natural areas. Recent research has shown that location-based social media data may reveal spatial and temporal patterns of the use of space, and reveal areas where human activities might be detrimental. We have shown that social media data corresponds to real-life spatial and temporal patterns of visitors in national parks and is able to shed light on the use of space in cities by providing meaningful information about the activities and preferences of people. The overwhelming magnitude of social media data requires special filtering and cleaning and tested analysis approaches. We are now using geospatial analysis methods together with machine learning to understand where, when, how and by whom areas are being used, and how and why people and goods move about. Automated text and image content analysis is needed to leverage the full potential of social media data in spatial planning. New applications are also yet to be discovered.

Speaker: Tuuli Toivonen

Affiliation: Professor of Geoinformatics, University of Helsinki

Place of Seminar: University of Helsinki

Slides

Bayesian Deep Learning for Image Data

Date: February 26, 2018

Abstract: Deep learning is the paradigm that lies at the heart of state-of-the-art machine learning approaches. Despite their groundbreaking success on a wide range of applications, deep neural nets suffer from: i) being severely prone to overfitting, ii) requiring intensive handcrafting in topology design, iii) being agnostic to model uncertainty, and iv) demanding large volumes of labeled data. The Bayesian approach provides principled solutions to all of these problems. Bayesian deep learning converts the loss minimization problem of conventional neural nets into a posterior inference problem by assigning prior distributions on synaptic weights. This talk will provide a recap of recent advances in Bayesian neural net inference and detail my contributions to the solution of this problem. I will demonstrate how Bayesian neural nets can achieve groundbreaking performance in weakly-supervised learning, active learning, few-shot learning, and transfer learning setups when applied to medical image analysis and core computer vision tasks. I will conclude with a summary of my ongoing research in reinforcement active learning, video-based imitation learning, and reconciliation of Bayesian Program Learning with Generative Adversarial Nets.

Speaker: Melih Kandemir

Dr. Kandemir studied computer science in Hacettepe University and Bilkent University between 2001 and 2008. Later on, he pursued his doctoral studies in Aalto University (former Helsinki University of Technology) on the development of machine learning models for mental state inference until 2013. He worked as a postdoctoral researcher in Heidelberg University, Heidelberg Collaboratory for Image Processing (HCI) between 2013 and 2016. As of 2017, he is an assistant professor at Özyeğin University, Computer Science Department. Throughout his career, he took part in various research projects in funded collaboration with multinational corporations including Nokia, Robert Bosch GmbH, and Carl Zeiss AG. Bayesian deep learning, few-shot learning, active learning, reinforcement learning, and application of these approaches to computer vision are among his research interests.

Affiliation: Professor of Computer Science, Özyeğin University

Place of Seminar: Aalto University

Slides

Learning and Stochastic Control in Gaussian Process Driven Physical Systems

Date: February 19, 2018

Abstract: Traditional machine learning often overemphasises problems where we wish to automatically learn everything from the problem at hand solely using a set of training data. However, in many physical systems we already know much about the physics, typically in the form of partial differential equations. For efficient learning in this kind of system, it is beneficial to use gray-box models where only the unknown parts are modeled with data-trained machine learning models. This talk is concerned with learning and stochastic control in physical systems which contain unknown input or force signals that we wish to learn from data. These unknown signals are modeled using Gaussian processes (GP) from machine learning. The resulting latent force models (LFMs) can be seen as hybrid models that contain a first-principles physical model part and a non-parametric GP model part. We present and discuss methods for learning and stochastic control in this kind of model.

Speaker: Simo Särkkä

Simo Särkkä is an Associate Professor and Academy Research Fellow with Aalto University, Technical Advisor of IndoorAtlas Ltd., and an Adjunct Professor with Tampere University of Technology and Lappeenranta University of Technology. His research interests are in multi-sensor data processing systems with applications in artificial intelligence, machine learning, inverse problems, location sensing, health technology, and brain imaging. He has authored or coauthored around 100 peer-reviewed scientific articles, and his book “Bayesian Filtering and Smoothing” and its Chinese translation were published by Cambridge University Press in 2013 and 2015, respectively. He is a Senior Member of IEEE and is serving as an Associate Editor of IEEE Signal Processing Letters.

Affiliation: Professor of Electrical Engineering and Automation, Aalto University

Place of Seminar: University of Helsinki

Here Be Dragons: High-Dimensional Spaces and Statistical Computation

Date: February 12, 2018

Abstract: With consistently growing data sets and increasingly complex models, the frontier of applied statistics is found in high-dimensional spaces. Unfortunately, most of the intuitions that we take for granted in our low-dimensional, routine experiences don’t persist in these high-dimensional spaces, which makes the development of scalable computational methodologies and algorithms all the more challenging. In this talk I will discuss the counter-intuitive behavior of high-dimensional spaces and the consequences for statistical computation.
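
One standard example of such counter-intuitive behavior (not necessarily the one used in the talk) is how volume distributes in high dimensions. The sketch below computes the fraction of the cube [-1, 1]^d occupied by the inscribed unit ball, using the volume formula V_d = pi^(d/2) / Gamma(d/2 + 1), and the fraction of the ball's volume lying in a thin outer shell. Function names and the chosen dimensions are illustrative.

```python
import math


def unit_ball_volume(d):
    """Volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def ball_fraction_of_cube(d):
    """Fraction of the cube [-1, 1]^d that lies inside the inscribed unit ball."""
    return unit_ball_volume(d) / 2 ** d


def outer_shell_fraction(d, thickness=0.01):
    """Fraction of the unit ball's volume within `thickness` of its surface."""
    return 1.0 - (1.0 - thickness) ** d


if __name__ == "__main__":
    for d in (2, 3, 10, 20, 100):
        print(d, ball_fraction_of_cube(d), outer_shell_fraction(d))
```

As d grows, the ball takes up a vanishing share of the cube, and almost all of the ball's own volume sits in the thin shell near its surface, so low-dimensional intuition about where "most of the space" lies stops being a useful guide.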

Speaker: Michael Betancourt

His research focuses on the development of robust statistical workflows, computational tools, and pedagogical resources that bridge statistical theory and practice and enable scientists to make the most out of their data. The pursuit of general but scalable statistical computation has led him to the intersection of differential geometry and probability theory, where exploiting the inherent geometry of high-dimensional problems naturally leads to algorithms such as Hamiltonian Monte Carlo and its generalizations. He is developing both the theoretical foundations and the practical implementations of these algorithms, the latter specifically in the software ecosystem Stan.

Affiliation: Columbia University

Place of Seminar: Aalto University

Slides

Studying mutational processes in cancer

Date: February 5, 2018

Abstract: Somatic mutations in cancer have accumulated during its evolution and are caused by different exposures to carcinogens and therapeutic agents, as well as intrinsic errors that occur during DNA replication. Analysing a set of cancer samples jointly allows us to explain their somatic mutations as a linear combination of (to be learned) mutational signatures. In this presentation I will discuss the problem of learning mutational signatures from cancer data using probabilistic modelling and nonnegative matrix factorisation. I further describe our ongoing work using mutational signatures in the context of drug response prediction and extensions of the basic model to explicitly include DNA repair processes.

Speaker: Ville Mustonen

Affiliation: Professor of Mathematics and Natural Science, University of Helsinki

Place of Seminar: University of Helsinki

Special Seminar

Date: January 29 – February 2, 2018

This week we will have multiple machine learning talks at Aalto. We cancel the normal Monday session so that we can meet in them according to the following plan:

Jan 29, Glowacka Dorota

  • Talk at 13:00 at TUAS 1171-72
  • Title: Machine Learning Meets The User: Empirical Parametrization of Exploratory Search Systems

Jan 30, Heinonen Markus

  • Talk at 15:00 at TUAS 1021
  • Title: Towards Next-Generation Representational Learning

Feb 1, Marttinen Pekka

  • Talk at 16:00 in Seminar Room T3 (CS building)
  • Title: Machine Learning for Health

Feb 2, Solin Arno

  • Talk at 13:00 at TUAS 1171-72
  • Title: Machine Learning for Sensor Fusion in Positioning and Navigation

NOTE: The TUAS building address is Maarintie 8 (next to the CS building)

Confident Bayesian Learning of Graphical Models

Date: January 22, 2018

Abstract: Confident Bayesian learning amounts to computing summaries of a posterior distribution either exactly or with probabilistic accuracy guarantees. I will review the state of the art in confident Bayesian structure learning in graphical models, focusing on the class of Bayesian networks and its subclass of chordal Markov networks.

Speaker: Mikko Koivisto

Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: University of Helsinki

Slides

Scalable Algorithms for Extreme Multi-Class and Multi-Label Classification in Big Data

Date: January 15, 2018

Abstract: In the era of big data, large-scale classification involving tens of thousands of target categories is not uncommon. Such problems are referred to as Extreme Classification, and it has recently been shown that the machine learning challenges arising in ranking, recommendation systems and web advertising can be effectively addressed by reducing them to the extreme multi-label classification framework. In this talk, I will discuss two of my recent works, and present the TerseSVM and DiSMEC algorithms for extreme multi-class and multi-label classification. The precision@k and nDCG@k results using DiSMEC improve by up to 20% on benchmark datasets over state-of-the-art methods, which are used by Microsoft in the production system of Bing Search. The training process for these algorithms makes use of OpenMP-based distributed architectures, and is able to leverage thousands of cores for computation.
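
The two evaluation metrics named in the abstract can be sketched for a single test instance as follows, assuming binary relevance labels. precision@k is the fraction of the k highest-scored labels that are relevant. nDCG@k discounts each hit by log2(rank + 1) and normalises by the best achievable value. The function names and the toy scores are illustrative and unrelated to the DiSMEC implementation.

```python
import math


def top_k(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the top-k predicted labels that are truly relevant."""
    return sum(1 for i in top_k(scores, k) if i in relevant) / k


def ndcg_at_k(scores, relevant, k):
    """Normalised discounted cumulative gain at k with binary relevance."""
    # Position 0 is rank 1, so the discount is log2(rank + 1) = log2(position + 2).
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, i in enumerate(top_k(scores, k))
        if i in relevant
    )
    ideal = sum(1.0 / math.log2(position + 2) for position in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.4, 0.7]  # hypothetical classifier scores per label
    relevant = {0, 1, 4}                # hypothetical ground-truth label set
    print(precision_at_k(scores, relevant, 3), ndcg_at_k(scores, relevant, 3))
```

Benchmark numbers are typically averages of these per-instance values over the test set.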

Speaker: Rohit Babbar

Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: Aalto University