Distinguished Lecture Series

Helsinki Distinguished Lecture Series on Future Information Technology

HIIT organises the Helsinki Distinguished Lecture Series on Future Information Technology. The series focuses on the research challenges facing current and future information technology, and on their solutions, as seen by internationally leading experts in the field. The lectures are intended to be approachable for people with a scientific education in fields other than information technology, while also offering information technology experts new viewpoints on their own discipline. The series was launched in 2012, and in 2019 it continued with five lectures.

Modeling with Machine Learning: Challenges and Some Solutions

Tommi Jaakkola

MIT Computer Science and Artificial Intelligence Laboratory

January 16th, 2019

Video

Abstract

Machine learning has become an integral part of engineering and the sciences. It is rapidly extending the horizon of modeling, offering to accelerate or bypass detailed simulations and to create statistical (and causal) linkages and complex inferences about entities whose underlying relations may not yet be well understood. This broad integration with other disciplines also brings back new machine learning challenges, ushering in advances in core capabilities. For example, in the context of molecular design, we need to be able to learn to modify, combine and synthesize highly structured objects (molecules). I will illustrate this with our recent efforts to transform chemistry, especially in terms of forward synthesis (reaction outcome) and targeted molecular optimization (drug design). Another challenge that comes on the heels of complex machine learning approaches is that the resulting solutions can be opaque and difficult to understand or verify. Attempts to explain the behavior of complex models after the fact, i.e., after they have been trained, can be unstable and unsatisfactory. I will highlight some of our recent work on self-explaining models, where transparency is forced upon the models already as they are being trained. The two parts of the talk mirror some of the overall challenges of delivering broadly applicable, explainable AI.

About the Speaker

Tommi S. Jaakkola is the Thomas Siebel Professor of Electrical Engineering and Computer Science and the Institute for Data, Systems and Society (IDSS) at MIT. He received an M.Sc. in theoretical physics from Helsinki University of Technology and a Ph.D. in computational neuroscience from MIT, and joined the MIT faculty in 1998. His research focuses on both the foundational theory and the applications of machine learning, with the goal of delivering algorithms that operate at scale in an efficient, principled, and interpretable manner. The applied side of his work involves multi-faceted recommender, retrieval, or inferential tasks (e.g., biomedical), design and optimization of molecules or reactions for the purpose of drug design, and modeling strategic, game-theoretic interactions. He has received many awards for his publications.

Data-Driven Genomic Computing: Making Sense of the Signals from the Genome

Stefano Ceri

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano

March 15th, 2019

Video

Abstract

Genomic computing is a new science focused on understanding the functioning of the genome, as a premise to fundamental discoveries in biology and medicine. Next Generation Sequencing (NGS) allows the production of the entire human genome sequence at a current cost of about US$1,000; many algorithms exist for the extraction of genome features, or “signals”, including peaks (enriched regions), variants (mutated DNA sequences), and gene expression (intensity of transcription activity). The missing piece is a system supporting data integration and exploration, giving a “biological meaning” to the available information; such a system can be used, e.g., for precision medicine, which aims at assigning the best treatment to each patient.

In this talk, I will describe a new data-driven framework for extracting and integrating genomic features, which is made available to the scientific community; in this work, we use foundational data management abstractions, with the objective of simplifying and improving upon the many low-level bioinformatics tools currently in use. We developed a new query language and system for managing genomic datasets on the cloud, with programmatic interfaces for R and Python; we also developed a repository which integrates open data produced by large international consortia, after designing and extracting a common core of semantically aligned metadata. In my talk, I will also hint at some big data management problems that we face in providing optimized data access in the cloud, and at biological and clinical applications that have been developed using our systems. The framework internally uses the Spark big data engine and can be accessed at Cineca in Italy and at the Broad Institute in Cambridge (US).

This work is funded by an ERC Advanced Grant, “Data-Driven Genomic Computing” (GeCo).

About the Speaker

Stefano Ceri is Professor of Database Systems at the Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB) of Politecnico di Milano. His research work covers four decades (1978-2018) and has been generally concerned with extending database technologies to incorporate new features: distribution, object-orientation, rules, and streaming data. With the advent of the Web, his research turned towards the engineering of Web-based applications and search systems. More recently he turned to genomic computing. He has authored over 350 publications (H-index 76) and authored or edited 15 books in English. He is the recipient of two ERC Advanced Grants: “Search Computing (SeCo)” (2008-2013), focused on the rank-aware integration of search engines to support multi-domain queries, and “Data-Driven Genomic Computing (GeCo)” (2016-2021), focused on new abstractions for querying and integrating genomic datasets. He is the recipient of the ACM-SIGMOD “Edward T. Codd Innovation Award” (New York, June 26, 2013), an ACM Fellow and a member of Academia Europaea.

Searching for Experiences

Alon Halevy

May 27th, 2019

Abstract

The rise of Artificial Intelligence raises an important challenge: how can we develop AI that increases the well-being of its users? This talk will describe a set of projects that address this issue and raise challenges for Natural Language Understanding and data management. The main theme of these projects is to help users create experiences that make them happy. The first project, Jo, is a smart-journaling application that allows users to enjoy the insights of the field of Positive Psychology in the context of their own lives. Users log their important moments via short texts, and Jo attempts to give them insights that help them take steps toward creating more positive moments in their lives. The second project, Voyageur, helps users create positive experiences when they shop online for services. The observation underlying the project is that while users are searching for experiences (e.g., restaurant outings, vacations), online services only enable them to search based on objective, non-experiential attributes. Voyageur introduces the idea of subjective databases, which enable users to query directly for experiential aspects.

About the Speaker

Alon Halevy was the CEO of Megagon Labs until December 2018. Previously, Alon led the Structured Data Research Group at Google for 10 years, and before that he was a professor of computer science at the University of Washington. Alon is a founder of Nimble Technology and of Transformatic, Inc., which was acquired by Google in 2005. Alon is the author of two books: “The Infinite Emotions of Coffee” and “Principles of Data Integration.” Alon is an ACM Fellow and a recipient of the Sloan Fellowship and the PECASE Award. He received his Ph.D. in Computer Science from Stanford University in 1993.

Forty Years of Unsupervised Machine Learning

Erkki Oja

Aalto University

September 18th, 2019

Abstract

Unsupervised learning is a classical approach in artificial neural networks, pattern recognition and data analysis. Its importance is growing today, due to increasing data volumes and the difficulty of obtaining labeled training data of sufficient quantity and quality for supervised learning. The talk looks at the basic approaches of the past forty years, especially from the perspective of learning neurons and neural networks. One widely used methodology is linear latent variable models, such as principal component analysis, independent component analysis, and nonnegative matrix factorization. They can be implemented in one-layer neural networks or shallow autoencoders. Mathematically, they can be represented as decompositions of the data matrix containing the unlabeled samples. In deep learning, nonlinear latent variables can be found by deep autoencoders. Another widely used classical methodology is clustering, which is also related to matrix factorization. In self-organizing maps, the clusters are ordered in a specific way.

About the Speaker

Erkki Oja received the D.Sc. degree from Helsinki University of Technology in 1977. He is Past Director of the Computational Inference Research Centre and Distinguished Professor Emeritus of Computer Science and Engineering at the Department of Computer Science, Aalto University, Finland. He holds honorary doctorates from Uppsala University, Sweden, and from two Finnish universities. He has been a research associate at Brown University, Providence, RI, and a visiting professor at the Tokyo Institute of Technology, Japan, and the Université Panthéon-Sorbonne, France. He is the author or coauthor of more than 300 articles and book chapters on pattern recognition, computer vision, and neural computing, and of three books: “Subspace Methods of Pattern Recognition” (Research Studies Press and Wiley, 1983), which has been translated into Chinese and Japanese; “Kohonen Maps” (Elsevier, 1999); and “Independent Component Analysis” (Wiley, 2001; also translated into Chinese and Japanese). His research interests are in the study of neural networks and unsupervised machine learning, especially principal component and independent component analysis, self-organization, statistical pattern recognition, and applying machine learning to computer vision and signal processing. Much of this work is highly cited.

Prof. Oja is a member of the Finnish Academy of Sciences, the Finnish Academy of Technical Sciences, and the Academia Europaea, a Life Fellow of the IEEE, a Founding Fellow of the International Association of Pattern Recognition (IAPR), Past President of the European Neural Network Society (ENNS), and a Fellow of the International Neural Network Society (INNS). He is Past Chairman of the Finnish Research Council for Natural Sciences and Engineering and a Commander of the Order of the Lion of Finland. He is a current or past member of the editorial boards of several journals and has served on the program committees of several recent conferences. Prof. Oja is a recipient of the IEEE Computational Intelligence Society Neural Networks Pioneer Award, the IAPR P. Devijver Award, the INNS Hebb Award, and the IEEE Frank Rosenblatt Award.

Digital Health: Back to the Future

Sumi Helal

Lancaster University

December 19th, 2019

Abstract

The vision and hope of Digital Health is to transform the current fragmented and reactive primary care system (a point-of-care paradigm) into an integrated, proactive Health Navigator: a continuum-of-care paradigm capable of providing personalized and timely guidance and just-in-time interventions, while making real-time, individual- and population-level health information available to individuals, healthcare organizations, governments, and policy makers. In this talk, I will provide a critical review of the key advances achieved in the past 20 years, as well as the mistakes, missed opportunities, and, more generally, the lessons learnt. I will present seven such key lessons and close by summarizing the challenges ahead and potential paths forward.

About the Speaker

Sumi Helal is Professor and Chair in Digital Health at Lancaster University, UK, where he leads interdisciplinary research initiatives in digital health in both the School of Computing and Communications (Faculty of Science and Technology) and the Division of Health Research (Faculty of Health and Medicine). As Director of Lancaster University’s Center on Digital Health and Quality of Life Technologies, he leads several active projects on Connected Health Cities, Healthy New Towns design and implementation, suicide prevention using cybernetics and analytics, airport accessibility for the hearing impaired, and intelligent primary care GP-patient interactions. He is a board member and lead of the digital health infrastructure and strategies in the Fylde Whyndyke Garden Village, one of ten NHS England Healthy New Towns development projects. This 1400-unit, green grass development provides a unique opportunity to embed health elements, by design, in public areas, neighborhoods, and the town community hub (school, wellness center and health care facility), in order to promote health and wellbeing and active and healthy living and ageing, prevent illnesses, and improve people’s quality of life.

Before joining Lancaster, Prof. Helal was a Computer & Information Science and Engineering Professor at the University of Florida, USA, and Director of its Mobile and Pervasive Computing Laboratory. He co-founded and directed the Gator Tech Smart House, a real-world deployment project that aimed at identifying key barriers and opportunities to make the Smart Home concept commonplace (creating the “Smart Home in a Box” concept). His active areas of research focus on architectural and programmability aspects of the Internet of Things, and on pervasive/ubiquitous systems and their human-centric applications, with special focus on smart spaces, proactive health/wellness, patient empowerment and e-coaching, and assistive technology in support of personal health, aging, disabilities, and independence. Professor Helal served as the Editor-in-Chief of IEEE Computer (2015-2018), the Computer Society’s flagship and premier publication. He currently serves as a member of the Board of Governors of the IEEE Computer Society and Chair of its Magazine Operational Committee. Professor Helal is a Boilermaker (Ph.D., Purdue University, class of 1991), a Fellow of the IEEE, a Fellow of the IET, and a 2020 IEEE Computer Society President-Elect nominee. Contact him at sumi.helal@ieee.org.
