History of Previous Talks in Spring 2020

A Link between Coding Theory and Cross-Validation with Applications

Date: March 23, 2020

Abstract: We study the combinatorics of cross-validation-based AUC estimation under the null hypothesis that the binary class labels are exchangeable, that is, the data are randomly assigned into two classes. In particular, we study how the estimators based on leave-pair-out cross-validation (LPOCV), in which every possible pair of data with different class labels is held out from the training set at a time, behave under the null without any prior assumptions about the learning algorithm or the data. It is observed that the maximal number of different assignments of w nonzero and n-w zero class labels on the data, for which any fixed learning algorithm can achieve zero LPOCV error, is equal to the maximal size of a constant weight error-correcting code of length n, weight w and Hamming distance four between code words. We then introduce the concept of a light constant weight code and show similar results for bounded LPOCV errors. These results enable the design of new LPOCV-based statistical tests for the learning algorithm's ability to distinguish two classes from each other that are analogous to the classical Wilcoxon-Mann-Whitney U test for fixed functions.
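For readers unfamiliar with LPOCV, the following is a minimal sketch of the estimator described in the abstract, not the speaker's implementation. The toy one-dimensional nearest-centroid learner (`nearest_centroid_fit`) and the function name `lpocv_auc` are assumptions made purely for illustration; any learner that returns a scoring function could be substituted. The sketch assumes at least two examples of each class, so that both classes remain in every training set.

```python
from itertools import product


def nearest_centroid_fit(xs, ys):
    """Hypothetical toy learner: returns a scoring function for 1-D inputs."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    # A higher score means the point is closer to the positive centroid.
    return lambda x: abs(x - mu_neg) - abs(x - mu_pos)


def lpocv_auc(xs, ys, fit):
    """Leave-pair-out cross-validation estimate of AUC.

    Each (positive, negative) pair is held out in turn, the learner is
    trained on the remaining data, and the pair counts as correct if the
    positive example receives the higher score (ties count as 0.5).
    """
    pos_idx = [i for i, y in enumerate(ys) if y == 1]
    neg_idx = [i for i, y in enumerate(ys) if y == 0]
    total = 0.0
    for i, j in product(pos_idx, neg_idx):
        train = [k for k in range(len(xs)) if k not in (i, j)]
        score = fit([xs[k] for k in train], [ys[k] for k in train])
        s_pos, s_neg = score(xs[i]), score(xs[j])
        total += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
    return total / (len(pos_idx) * len(neg_idx))


xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]
auc = lpocv_auc(xs, ys, nearest_centroid_fit)
```

The LPOCV error referred to in the abstract corresponds to one minus this estimate. Recomputing the estimate after randomly permuting `ys` gives a feel for how it behaves under the null hypothesis of exchangeable labels.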

Speaker: Professor Tapio Pahikkala

Affiliation: Department of Future Technologies, University of Turku

Place of Seminar: Zoom

AI approaches to support language learning

Date: March 16, 2020 – CANCELLED

Abstract: The talk will present an overview of Revita — a project in CALL: computer-aided language learning. We work specifically on learning beyond the elementary level — for intermediate to advanced learners. (Beginners receive adequate support from the many existing services and applications.) Revita aims to simulate a good teacher, by modeling and assessing the learner’s state and progress. An important aspect of our approach is allowing the user to learn from authentic materials — arbitrary texts, chosen by the users themselves.

Revita is the first AI-based system of scale that:

  • works beyond the elementary level, targeting intermediate to advanced learners;
  • is multi-lingual (English-only systems exist for advanced essay assessment);
  • is used in official university-level curricula, beyond “academic” experiments.

We are working on expanding to additional learning environments, and to additional languages.

Revita collects data, which allows us to analyze patterns of language learning in great depth and detail. We collect data about the learning process from actual learners — about typical mistakes, paths of progress, etc. This data provides a playground for research problems, which we will discuss, focusing on two sides of the research:

  • tools needed to collect and analyze the data, and
  • methods of analysis and applications of the collected data, which include neural networks, in particular, sequence-to-sequence models.

Bio: Roman Yangarber is an Associate Professor at the Department of Digital Humanities (DIGIHUM), University of Helsinki. Prior to moving to DIGIHUM, he led the research group in natural language processing at the Department of Computer Science, over the last 10 years. The group has been working on a variety of themes in NLP, researching how language works and how computers can better understand language. Research themes include analysis of news media and modeling language evolution. The more recent research — AI support for language learning — has resulted in a system used by end-users at several universities, and has won best paper awards at a Digital Humanities conference last year.

Speaker: Professor Roman Yangarber

Affiliation: Department of Digital Humanities, University of Helsinki

Place of Seminar: CANCELLED

Interactive AI and Machine Teaching of Active Sequential Learners

Date: March 09, 2020

Abstract: Interactive intelligent systems, such as recommendation systems, are commonly based on sequential machine learners which actively choose their queries (e.g., recommendations) and learn from the responses. How can one steer, or teach, such a system towards a desired goal or state? An answer in the form of a computational model of the teacher provides an approach towards modelling the active planning behaviour of users in human-computer interaction.

I will talk about our recent approach towards solving the problem through machine teaching. We formulate the sequential teaching problem as a Markov decision process and address the complementary problem of learning from a teacher through probabilistic inverse reinforcement learning. In conventional machine teaching settings, the teacher provides data that are consistent with the true data distribution. However, we find that in our more constrained setting, consistent teachers can be sub-optimal. Simulated experiments and a user study with multi-armed bandit learners demonstrate empirically the benefits of the approach.

The project website is available at https://aaltopml.github.io/machine-teaching-of-active-sequential-learners/.

Speaker: Dr. Tomi Peltola

Affiliation: Curious AI

Place of Seminar: Lecture Hall T6, Konemiehentie 2, Aalto University

Algorithmization of Counterfactuals and a Probabilistic Theory of Causality

Date: March 02, 2020

Abstract: This talk discusses a Bayesian network, a probability distribution factorized along a diamond DAG (four nodes). At the highest layer we use noisy Boolean functions (of two variables). When the Fourier series of these noisy Boolean functions are used in the algorithm for counterfactuals (Pearl), we obtain easy explicit formulae for computing counterfactual probabilities.

Causality and causal inference have been, and remain, of interest to fields such as cognitive science, genetic epidemiology, philosophy and AI. Philosophers have discussed the general nature of (probabilistic) causality. At the end of this talk, one definition of a probabilistic cause is applied to the counterfactual probabilities of the diamond DAG.

Bio: Timo Koski comes from the Department of Mathematics, KTH Royal Institute of Technology, Stockholm, and is currently visiting FCAI with Prof. Jukka Corander as host. His scientific interests are probability, genetics, causal inference and Bayesian networks.

Speaker: Professor Timo Koski

Affiliation: Department of Mathematics, KTH Royal Institute of Technology

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Meta Reinforcement Learning for Sim-to-real adaptation

Date: February 24, 2020

Speaker: Professor Ville Kyrki

Affiliation: Department of Electrical Engineering and Automation, Aalto University

Place of Seminar: Lecture Hall T6, Konemiehentie 2, Aalto University

Intelligent service assistant for people in Finland

Date: February 17, 2020

Abstract: Historically, societal challenges have been handled as complicated ones by dividing responsibilities among various organizations that are then mandated to provide needed solutions and services from their own perspective. This has also defined the enterprise architecture, which has tended to mimic organizational structures and their responsibilities. Today, modern societies are facing complex challenges, such as climate change, that can’t be solved without a paradigm shift in problem-solving. This also challenges the capabilities of information systems to co-operate in a cross-sectoral manner. Advances in ICT and its adaptation and penetration rate in every part of society introduce new possibilities whose benefits can only be fully realized by providing more integrated information systems. Moreover, citizens’ expectations have risen due to other systems that liberally connect data from different sources. To meet new goals, societal services should become more proactive and personalized, satisfying citizens’ needs as they emerge. In this paper, we propose using a holistic model of the digital twin paradigm for societal applications. The proposal builds on using a citizen 360-data model that reflects the characteristics of citizens that act as service users. Based on the data model, societal information systems can propose actions and provide proactive services that are mass-tailored to meet individuals’ needs.

Bio: Aleksi Kopponen is a Special Advisor of Digitalization at the Ministry of Finance in Finland, pioneering human-centric governance and cross-sectoral service ecosystem operating models in action. Aleksi’s responsibilities have included preparation and implementation of the principles of digitalization, implementation of The Playbook of Digitalization, and supervision of the digitalization supporting team D9 in State Treasury. Aleksi has also supported the preparation of the Government projects for digitalization of processes, totalling 100 M€ between 2015 and 2016. The Finnish Government has decided to introduce a new approach to strengthen human-centric management, digitalization and the creation and development of ecosystems. In this new operational model, smart services are organised around people’s life events and business events in cross-sectoral cooperation. Aleksi has had the main responsibility for the preparation and implementation of this new approach. Since people and communities need different services at different times, Finland has introduced AuroraAI, an artificial intelligence network where information and service needs to move between different smart applications in a cross-sectoral manner. AuroraAI will serve in all life events and business events, regardless of the time, in a uniform and ethical way.

Speaker: Professor Tommi Mikkonen & Aleksi Kopponen

Affiliation: University of Helsinki & Ministry of Finance

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

ODE2VAE: Deep generative second-order ODEs with Bayesian neural networks

Date: February 10, 2020

Abstract: Recently, there has been a growing interest in solving ordinary differential equation (ODE) systems using function approximators, e.g., Gaussian processes or neural networks. Such models have proven useful for modeling continuous-time dynamic phenomena and are also interestingly connected with very deep networks with skip connections, flow-based deep generative models and reinforcement learning.

In this talk, I’ll present an overview of our NeurIPS paper from last December: ODE2VAE, a latent second-order ODE model for high-dimensional sequential data. ODE2VAE can simultaneously learn the embedding of high-dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics. In particular, the talk will focus on the advantages of ODE modeling over discrete counterparts, why latent modeling is useful and the benefits of being Bayesian in this setting. To complete the picture, I’ll briefly give the historical context, draw connections with related techniques, and discuss exciting future directions.

Speaker: Yildiz Cagatay

Affiliation: Aalto University, Department of Computer Science

Place of Seminar: Lecture Hall T6, Konemiehentie 2, Aalto University

Environmental impacts of novel food production technologies

Date: February 03, 2020

Abstract: The global food systems are facing the challenge of providing healthy and adequate nutrition to the world’s growing population in a sustainable way. Food systems are one of the major causes of environmental change, contributing approximately 25% of the total anthropogenic greenhouse gas (GHG) emissions globally. Climate change is also affecting agricultural production as extreme weather events, such as heat waves, heavy rain periods, storms and floods, are becoming more frequent. Novel food production technologies, such as vertical farming, cell-culturing based food production (i.e. cellular agriculture) and meat substitutes made of plants, algae and fungi have gained wide interest in recent years as potential solutions for improving the sustainability of food systems. However, the consequences of wider-scale adoption of those technologies are not known. This talk will present the current understanding of the environmental impacts of novel food production technologies, and outline ideas for further research to estimate the global potential of novel food production technologies to help with reaching climate targets and adapting to climate change. The aim of the talk is to initiate discussion (and possible research collaboration) about the possibilities of using artificial intelligence and/or machine learning in the modeling study outlined in the talk.

Bio: Hanna Tuomisto is an associate professor in sustainable food systems at the Helsinki Institute of Sustainability Science (HELSUS) and Department of Agricultural Sciences at the University of Helsinki. She leads the Future Sustainable Food Systems research group. Her research interests are focused on estimating the potential of novel food production technologies and dietary changes to improve the sustainability of food systems. She has strong experience in the development and use of environmental sustainability assessment methods, such as life cycle assessment and carbon footprinting.

Tuomisto holds an MSc degree in Agroecology from the University of Helsinki and a doctoral degree from the University of Oxford. In her doctoral research, she compared the environmental impacts of organic, conventional and integrated farming systems. After finishing her doctoral degree, she worked for four years as a postdoctoral researcher at the European Commission’s Joint Research Centre (JRC), where she was involved in projects that developed carbon footprint and environmental footprint methods for the agriculture and food sector. In 2016-2017, Tuomisto worked as a postdoctoral researcher at the London School of Hygiene & Tropical Medicine (LSHTM), where her work focused on the links between environmental change, nutrition and health.

Speaker: Hanna Tuomisto

Affiliation: Associate professor, Department of Agricultural Sciences & Group Leader, Future Sustainable Food Systems

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Quality of Analytics as an Approach for Optimizing ML Systems: Initial Results and Roadmap

Date: January 27, 2020

Abstract: Machine learning (ML) systems need to be optimized in an end-to-end system manner, not just for ML algorithms and models. Therefore, the role of software systems and underlying distributed computing platforms and their intersections with ML has recently been discussed intensively in the systems and ML communities. Various abilities of ML systems, such as robustness, reliability and responsiveness, rely on the capabilities of underlying computing and data platforms, and optimizing ML systems requires a strong integration with software and systems engineering techniques. In our work, we are interested in addressing challenging runtime issues in the robustness, reliability, resilience and elasticity (R3E) of end-to-end ML systems. In this talk, we will discuss the view of quality of analytics (QoA) and the principles of elasticity engineering for big data and cloud computing that can be applied to ML systems. We will present initial results of applying QoA to ML pipelines and discuss a long-term view of research activities to support QoA for runtime optimization of ML systems.

Speaker: Hong-Linh Truong

Affiliation: Associate Professor, Department of Computer Science, Aalto University

Place of Seminar: Lecture Hall T6, Aalto University

A maximum likelihood approach for modelling viral evolution using genome sequence data

Date: January 20, 2020

Abstract: Viruses evolve very rapidly, on clinically relevant timescales. HIV evolves around repeated attacks from the host immune system to eventually cause AIDS. Seasonal influenza strains change from year to year, requiring updates to the influenza vaccine. New influenza strains can evolve from viruses that infect other species, causing global pandemic disease. Genome sequence data can provide an insight into all of these processes. Where evolution proceeds over a sufficient period of time, phylogenetic methods provide a framework for the analysis of data, but for shorter periods such approaches are inadequate. We here outline a maximum likelihood approach, using short-read data to evaluate the state, and over time the evolution, of a viral population. We illustrate the use of the method in understanding the potential evolution of pandemic influenza and its application to sequence data from a case of long-term influenza infection.

Speaker: Dr Chris Illingworth

Affiliation: Professor of Department of Genetics, University of Cambridge

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Content Based Recommendation Engines’ tech building blocks, with particular focus on word embedding models and user experience

Date: January 17, 2020 (9:15 – 11:00)

Abstract: Iris.ai is tackling the problem of an exponentially expanding corpus of scientific knowledge, and limited human ability to manually sort, navigate and review it. Old-school solutions are based on limiting keywords and human-made taxonomies, revolving around a biased citation system and with results presented in endless and unstructured lists. The Iris.ai technology, using key concept extraction, contextual synonym enrichment, neural topic modeling and word importance-based document similarity, allows us to build intuitively meaningful content-based indexes. These indexes and their corresponding content are turned into intuitive, helpful visualizations for improved overview and navigation. This offers a novel and more efficient way of performing systematic research landscape mappings, reducing manual labor by 78%.

Bio: Victor Botev is the CTO and Co-Founder of Iris.ai, previously a researcher at Chalmers University of Technology. He completed separate Master’s degrees in Artificial Intelligence and in Computer Systems and Networks at Sofia University St. Kliment Ohridski and Chalmers, respectively. After his degrees he stayed at Chalmers, where he conducted research on clustering and predictive neural network models and the usage of signal processing techniques in studying Big Data. Victor has put his unique combination of AI research, software development leadership and ambitious vision to the ultimate test at Iris.ai.

Speaker: Victor Botev

Affiliation: CTO, Iris.ai

Place of Seminar: Lecture Hall T2, Konemiehentie 2, Aalto University

A new look into teaching AI and Machine Learning: Lessons from the Elements of AI

Date: January 13, 2020

Abstract: AI and machine learning are some of the hottest topics both inside academia and in the “real world”. These topics have been taught for decades and there exist standard syllabi for introductory courses. However, as AI is shaping every aspect of our society, we need to change the ways in which it is taught. The Elements of AI is an exceptional AI course because it doesn’t require any prerequisites in computing or mathematics beyond basic arithmetic.

When we started the Elements of AI, there was no guarantee that our approach would be viable. Perhaps we would only end up exacerbating the hype and spreading misconceptions instead of resolving them. After 1.5 years, five language versions (soon to be >20), and over 300 000 users, we can conclude that yes, the approach works: it is possible to learn to understand AI without knowing any programming.

In this talk, I describe the basic concept, our pedagogical thinking, and some of the lessons learned. The talk hopefully encourages others to engage, one way or another, in similar initiatives.

Speaker: Teemu Roos

Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: Lecture Hall T6, Konemiehentie 2, Aalto University