Research proposal DACCOMPLI granted by NWO

Recently, the research proposal “DACCOMPLI: Dynamic Data Analytics through automatically Constructed Machine Learning Pipelines” has been granted by NWO within the research program COMMIT2DATA Horizontaal 2016. For the execution of the project, a PhD candidate will be appointed in the group of Assistant Professor Joaquin Vanschoren.

About the project
The research project is a collaboration between TU Eindhoven, Leiden University (main applicant), Honda Research Institute Europe, and TU Delft. Its aim is to develop a platform for dynamic data analytics, based on techniques that automatically construct data analytics pipelines for the task at hand.

The project will develop algorithm configuration approaches for composing, configuring, and parameterizing such pipelines from scratch, thereby automatically generating the best solution method for each application task. For decision making, multi-objective optimization will then use the resulting models to generate optimal decisions in each application.

M&CS research
Our M&CS researchers will focus on the automatic optimization of data analytics pipelines. They will address the following research questions:

Research Question 1.1: How can we learn from previous data analytics pipelines? Pipelines constructed on real-world datasets have exceedingly large and complex design spaces (i.e., many possible operators and many hyperparameters). How can we leverage previous pipeline evaluations to find promising pipeline/hyperparameter configurations more efficiently, and to eliminate uninteresting ones?
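To make the idea of reusing earlier evaluations concrete, here is a minimal sketch. It is not the project's method. It assumes a hypothetical history of (dataset, configuration, accuracy) records and simply ranks configurations by their mean score on earlier datasets, keeping only the top few for evaluation on a new dataset. All names and numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical history of earlier evaluations: (dataset, configuration, accuracy).
# A configuration is a (preprocessing step, learner) pair; values are made up.
history = [
    ("d1", ("scale", "svm"), 0.81),
    ("d1", ("scale", "tree"), 0.74),
    ("d1", ("none", "svm"), 0.62),
    ("d2", ("scale", "svm"), 0.88),
    ("d2", ("scale", "tree"), 0.79),
    ("d2", ("none", "svm"), 0.60),
]


def rank_configurations(history):
    """Order configurations by their mean score over previous datasets."""
    scores = defaultdict(list)
    for _dataset, config, score in history:
        scores[config].append(score)
    return sorted(scores, key=lambda c: mean(scores[c]), reverse=True)


def shortlist(history, keep):
    """Keep the `keep` most promising configurations and drop the rest."""
    return rank_configurations(history)[:keep]


# Only these configurations would be evaluated first on a new dataset.
candidates = shortlist(history, keep=2)
print(candidates)
```

A real system would also take into account how similar the new dataset is to the earlier ones. This sketch ignores that and treats all previous datasets equally.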

Research Question 1.2: How can pipelines be adapted to evolving data? Current state-of-the-art optimization techniques optimize hyperparameters on static data. The challenge here is to develop new techniques that continuously adapt hyperparameter settings as the data evolves. Likewise, pipelines should adapt to evolving data by re-optimizing the selection of operators (e.g. preprocessing or learning algorithms).
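As a rough illustration of re-optimizing on evolving data, the sketch below re-selects a single hyperparameter after every batch, using only a sliding window of recent examples. It is a toy under stated assumptions, not the technique the project will develop. The threshold rule, the grid, the window size, and the data stream are all hypothetical.

```python
from collections import deque


def accuracy(threshold, examples):
    """Share of examples where the rule `x >= threshold` matches the label."""
    return sum((x >= threshold) == label for x, label in examples) / len(examples)


def reoptimize(window, grid):
    """Pick the grid value with the best accuracy on the recent window."""
    return max(grid, key=lambda t: accuracy(t, window))


grid = [0.25, 0.5, 0.75]   # hypothetical hyperparameter grid
window = deque(maxlen=4)   # only the most recent examples are kept

# Hypothetical stream: the true decision boundary drifts upward between batches.
batches = [
    [(0.1, False), (0.4, False), (0.6, True), (0.9, True)],
    [(0.3, False), (0.6, False), (0.7, False), (0.8, True)],
]

chosen = []
for batch in batches:
    window.extend(batch)                     # old examples fall out of the window
    chosen.append(reoptimize(window, grid))  # re-select the hyperparameter

print(chosen)
```

The fixed-size window is the simplest way to forget outdated data. Choosing when to re-optimize, and which operators to swap, is exactly the open question the project raises.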