Data Science Summit 2018

On Tuesday, November 27, we held a very busy summit, this year at the Muziekgebouw in Eindhoven, which turned out to be a perfect location. We expected a large crowd, as we had already sold out two weeks in advance, and the room was full during the morning program. We kicked off the day with our brand-new Data Science movie and encouraging words from our rector magnificus, Frank Baaijens.

The audience learned from Daniel Keim’s inspiring keynote about the power of visual analytics, combining the strengths of both computers and humans in analysis and decision making. Interestingly enough, this was also one of the key messages in the last lecture, which reflected on responsible data science and the importance of the human in the loop in fraud detection projects. During the lectures on our research programs, the audience learned first-hand about excellent, industry-inspired research at Eindhoven University of Technology. Examples were given in diverse domains, including service logistics, fog computing, heartbeat analysis and data-driven training programs in sports.

Many of the poster presenters took the opportunity to briefly pitch their posters in the plenary program. This resulted in a very energetic session as well as a lot of attention for the posters themselves. With well over 50 posters, this was a vibrant place to discuss concrete projects in more depth. It was also the moment when new contacts were established, hopefully resulting in many new projects that you will all see at next year’s summit!

The evaluation was very positive, with an overall score of 7.8 (out of 10). Almost all talks received a score of 4 or higher (out of 5), with the keynote rated most highly at 4.6. The poster session and the duo-talk format received similarly positive feedback, so both will return next year. We, the organizing committee, were very pleased with the program and the highly engaged visitors, and we already look forward to next year’s edition.

Please use this link to download all presentations, the brochures, the poster booklet and over 200 Summit pictures. We look forward to seeing you all again next year!

Quantified Self

Data Science in elite and recreational sports

•    Arno Knobbe (Leiden University)
•    Aarnout Brombacher (TU/e, ID)

One of the more interesting application areas in the field of “Quantified Self” is sports. While in many application domains the (new) rules and regulations on privacy (such as the recent GDPR) make it more difficult to obtain (bodily) data from real people in real life, this is true to a much lesser extent in sports. Athletes, both recreational and elite, do not mind sharing data for research purposes, provided it improves their performance (especially for elite athletes), their health and/or the health of others. In this lecture, Dr. Knobbe will give an overview of research taking place at Leiden University (LIACS) in the field of data mining and analytics of data obtained from (elite) athletes. Prof. Brombacher will give an overview of the national activities in the field of (big) data and sports as they are currently being developed at the national level by the TopTeam Sports and Eindhoven University.

Biography Arno Knobbe
Dr. Arno Knobbe is an assistant professor at the Leiden Institute of Advanced Computer Science, where he heads a research group that focuses on data science in sports. His work is aimed at injury prevention and performance optimization by means of detailed sports data captured by athletes, coaches and embedded scientists. Focusing on elite sports, his group has worked with the national men’s volleyball team, cycling and speed skating team Lotto-Jumbo, the Leiden Marathon, and the women’s national soccer team (who happened to win the European Championship). Knobbe also heads the Sport Data Center, a cooperation between research institutes from Leiden, Amsterdam, Delft and Twente.

Biography Aarnout Brombacher
Aarnout Brombacher was appointed on July 1, 1993 as full professor at Eindhoven University of Technology. He is currently professor of “Design theory and information flow analysis” in the faculty of Industrial Design of Eindhoven University of Technology. With this chair he is responsible for research and education in the fields of Quality Information Flows and Customer Perceived Quality in highly innovative product design and development processes, especially for products, systems and services that relate to (recreational) sports and vitality. He has authored and co-authored over 100 journal papers and is, at university level, coordinator of the (university) program People, Sports and Vitality and member of the national TopTeam (an advisory body to the cabinet) on Sports and Vitality. From March 1, 2010 until March 1, 2018 he was also Dean of the department of Industrial Design.

Health Analytics

Probabilistic modeling of PPG signals with application to premature beat detection

•    Reinder Haakma (Philips Research)
•    Paulo Serra (TU/e, M&CS)

In this work we introduce a Bayesian model for photoplethysmography (PPG) signals. Photoplethysmography is an unobtrusive, inexpensive, optical technique to measure blood volume changes in cardiovascular tissue. PPG signals are acquired by a device typically placed on the wrist or finger. The signal carries information about more than just heart rate and can thus be used for extensive analyses of health status in a broader sense. This makes PPG signals very informative, but also complex. Our approach decomposes the PPG signal by extracting a so-called baseline wandering component. This component is connected with respiration. The remainder of the signal is quasi-periodic and is decomposed into a set of pulses that are modelled using a Bayesian model. The resulting method is self-contained, automated and provides a good compression of the data while keeping the rich information of the PPG signal. We illustrate our approach by showing how it can be used to automatically detect premature beats in patients. This is joint work with M. Regis and E.R. van den Heuvel (Eindhoven University of Technology, Philips), and L.M. Eerikainen (Philips).
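The pipeline described above can be sketched in simplified form. The toy Python example below is not the authors’ Bayesian model: it substitutes a plain moving average for the baseline wandering component and a crude inter-beat-interval rule for flagging premature beats, and all signal parameters (sampling rate, beat times, thresholds) are invented for illustration.

```python
import numpy as np

def detect_premature_beats(ppg, fs, win_s=1.0, rel_threshold=0.8):
    """Toy pipeline: remove a slow baseline component with a moving
    average, pick pulse peaks in the remainder, and flag beats whose
    inter-beat interval is unusually short as premature."""
    win = int(win_s * fs)
    # Moving-average estimate of the baseline wandering component.
    baseline = np.convolve(ppg, np.ones(win) / win, mode="same")
    pulses = ppg - baseline
    # Simple peak picking: local maxima above half the residual's maximum.
    thr = 0.5 * pulses.max()
    peaks = np.array([i for i in range(1, len(pulses) - 1)
                      if pulses[i] > thr
                      and pulses[i] >= pulses[i - 1]
                      and pulses[i] > pulses[i + 1]])
    ibis = np.diff(peaks) / fs                 # inter-beat intervals (s)
    premature = peaks[1:][ibis < rel_threshold * np.median(ibis)]
    return peaks, premature

# Synthetic PPG: 1 Hz pulses, one premature beat at t = 5.0 s, and a slow
# respiration-like drift.
fs = 100
t = np.arange(0, 10, 1 / fs)
beat_times = [0.5, 1.5, 2.5, 3.5, 4.5, 5.0, 6.5, 7.5, 8.5, 9.5]
ppg = 0.3 * np.sin(2 * np.pi * 0.2 * t)        # baseline wandering
for bt in beat_times:
    ppg += np.exp(-((t - bt) ** 2) / (2 * 0.05 ** 2))

peaks, premature = detect_premature_beats(ppg, fs)
```

On this synthetic signal the method recovers all ten pulses and flags only the shortened interval around 5.0 s; the real contribution of the talk is replacing these heuristics with a probabilistic model of the pulses.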

Biography Reinder Haakma
Reinder Haakma (MSc Electrical Engineering, 1985, University of Twente; PhD, 1998, Eindhoven University of Technology) joined Philips in 1987. From 1997 onwards he has been a scientist and team leader of various research activities in personal health, wearable technology and out-patient monitoring. He is involved in a number of public-private partnership collaborations. His expertise includes user-system interaction, embedded user interface architectures, behavior and physiological modelling and analysis, and unobtrusive sensing technologies.

Biography Paulo Serra
Dr. Paulo Serra is currently an assistant professor in the Stochastics group of the Department of Mathematics and Computer Science of TU/e. He is also involved in the Health Analytics and the Smart Manufacturing & Maintenance research programs of DSCE. He was a postdoc at the University of Amsterdam, working as part of the NETWORKS project, funded under NWO's Zwaartekracht grant.

Paulo’s research interests are wide-ranging, focusing on (non-parametric) mathematical statistics. Most of his work involves Markov processes and Markov chains, Markov chain Monte Carlo, and Poisson processes. Much of this work is theoretical, but he is also very interested in the implementation of algorithms and their practical applications, programming, and online (recursive) algorithms.

Internet of Data

Future-proof modeling of systems - optimizing distributed resources in the age of fog computing

•    Remco Schoenmakers (Thermo Fisher)
•    Georgios Exarchakos (TU/e, EE)

Thermo Fisher Scientific’s electron microscopes, like many other sensors, produce massive amounts of data at an ever-increasing rate. In order to collect the best possible data, immediate analysis of this data, and even feedback to the instrument, is necessary, but compute resources near the microscope are scarce and outdate rapidly. At the same time, software is evolving as well. How can we (re)distribute data processing tasks dynamically in such a way that the total cost (in dollars) is minimized? How can we make this a future-proof system that adapts when new hardware or software becomes available? In essence, network and compute resources should be dynamically allocated to a task given the load, the local and remote capacity, and their attached usage costs. Fog computing is a paradigm that allows IoT to achieve stringent non-functional requirements (e.g. latency), at a high price in complexity. While years of research on model-based design have produced powerful solutions, uncertainties in vastly heterogeneous and distributed systems are largely ignored. Evolutionary fog systems are able to integrate usage-, application- and network-driven uncertainties and continuously balance resource utility, user experience and cost.
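The cost-minimizing redistribution question can be made concrete with a small sketch. The tiers, per-unit costs, latency bounds and task loads below are all invented for illustration, and the brute-force search merely stands in for the adaptive allocation mechanisms the talk is actually about; it shows the shape of the optimization problem, not a realistic solver.

```python
from itertools import product

# Hypothetical compute tiers: capacity in load units, usage cost in
# $/unit, and round-trip latency to the instrument.
nodes = {
    "edge":  {"capacity": 4,   "cost": 5.0, "latency_ms": 1},
    "fog":   {"capacity": 8,   "cost": 2.0, "latency_ms": 10},
    "cloud": {"capacity": 100, "cost": 0.5, "latency_ms": 120},
}
# Hypothetical processing tasks with a load and a latency requirement.
tasks = [
    {"name": "feedback-loop", "load": 2,  "max_latency_ms": 5},
    {"name": "denoising",     "load": 6,  "max_latency_ms": 50},
    {"name": "archival",      "load": 20, "max_latency_ms": 1000},
]

def best_placement(tasks, nodes):
    """Exhaustively try every task-to-node assignment, keep those that
    respect node capacities and task latency bounds, and return the one
    with the lowest total cost (sum of load * node cost)."""
    best, best_cost = None, float("inf")
    for assign in product(nodes, repeat=len(tasks)):
        used = {n: 0 for n in nodes}
        feasible, cost = True, 0.0
        for task, n in zip(tasks, assign):
            if nodes[n]["latency_ms"] > task["max_latency_ms"]:
                feasible = False
                break
            used[n] += task["load"]
            cost += task["load"] * nodes[n]["cost"]
        if (feasible
                and all(used[n] <= nodes[n]["capacity"] for n in nodes)
                and cost < best_cost):
            best = dict(zip([t["name"] for t in tasks], assign))
            best_cost = cost
    return best, best_cost

placement, cost = best_placement(tasks, nodes)
```

In this toy instance the tight latency bound pins the feedback loop to the expensive edge tier, the denoising task falls to the fog, and bulk archival drifts to the cheap cloud; when loads, capacities or costs change over time, the assignment has to be recomputed, which is exactly the dynamic redistribution problem posed above.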