# Algorithms for clean, complete data to manage mobility

December 19, 2022

Bram Custers seeks to provide algorithms that give rise to smart cities.

Nowadays, nearly all vehicles are equipped with electronic devices that measure where they are. Using the data that these devices produce, we can set out to improve the way we construct and manage mobility in our cities, giving rise to so-called ‘smart cities’. Commonly, these data are spatio-temporal in nature: they capture the path a vehicle drives by sampling its location at fixed time intervals. A sequence of such measurements for a single vehicle is referred to as the trajectory of that vehicle.

To be able to use these data, we need to devise ways to process the trajectories into meaningful knowledge. A major challenge that arises when working with trajectories is obtaining clean, complete data. After all, the measurements are subject to noise, and the devices themselves may be faulty at times. Another challenge is to visualize and summarize the vast amount of data that is available. Traffic in a city consists of many vehicles, and with many measurements per trajectory, this quickly becomes a huge amount of data to process. To address these challenges in a meaningful way, PhD candidate Bram Custers seeks to provide algorithms that consider the context of the trajectories: the environment and setting in which the trajectories were measured. He defended his thesis on Friday 16 December at the department of Mathematics and Computer Science.

### Means to clean up trajectories

In his thesis, Custers provides novel algorithms and approaches that utilize different types of context for the trajectories. To detect outliers in trajectories, he introduces a model that considers the consistency of the trajectories with physical reality. He considers models that bound the maximum speed and/or the maximum acceleration of the entity that underlies the measured trajectory. Using these models, he deems measurements to be outliers if they are not part of the largest consistent subtrajectory. This gives a means to clean up trajectories. Custers analyzed the performance of the resulting algorithms on a real-world data set and compared the resulting trajectories with results from benchmark and state-of-the-art algorithms. Furthermore, he used physics models to tackle the problem of gap-filling a trajectory.
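To give a feel for the speed-bounded model, the sketch below is an illustration only and not the algorithm from the thesis. It assumes a trajectory is a list of `(t, x, y)` samples with strictly increasing timestamps, and it calls a subtrajectory consistent when every pair of consecutive kept samples can be connected without exceeding a hypothetical `max_speed`. The function name, the data layout and the simple quadratic-time dynamic program are all assumptions made for this example.

```python
from math import hypot


def largest_consistent_subtrajectory(points, max_speed):
    """Return indices of a largest speed-consistent subsequence of samples.

    points: list of (t, x, y) tuples with strictly increasing t (assumed layout).
    max_speed: assumed upper bound on the speed of the moving entity.
    """
    n = len(points)
    if n == 0:
        return []
    best = [1] * n    # best[i]: length of the longest consistent chain ending at i
    prev = [-1] * n   # prev[i]: predecessor of i in that chain, or -1
    for i in range(n):
        ti, xi, yi = points[i]
        for j in range(i):
            tj, xj, yj = points[j]
            # j can precede i if the distance is coverable within the elapsed time
            reachable = hypot(xi - xj, yi - yj) <= max_speed * (ti - tj)
            if reachable and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    # Walk back from the end of a longest chain to recover the kept indices
    end = max(range(n), key=best.__getitem__)
    kept = []
    while end != -1:
        kept.append(end)
        end = prev[end]
    return kept[::-1]


# Toy trajectory (made-up values): the third sample jumps implausibly far
trajectory = [(0, 0.0, 0.0), (1, 10.0, 0.0), (2, 500.0, 0.0),
              (3, 30.0, 0.0), (4, 40.0, 0.0)]
kept = largest_consistent_subtrajectory(trajectory, max_speed=15.0)
outliers = [i for i in range(len(trajectory)) if i not in kept]
# kept == [0, 1, 3, 4], outliers == [2]
```

With a pure speed bound, checking consecutive kept samples is enough: by the triangle inequality, any two kept samples are then also mutually reachable. A bound on acceleration does not decompose this simply and is not covered by this sketch.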

Next, Custers considered the problem of reconstructing routes in a road network. By combining different data sources that describe the traffic situation in the network, it is possible to infer more information about the underlying traffic that these data sources represent. To reconstruct routes, Custers combined two such sources: a set of GPS trajectories representative of the traffic, and checkpoint data, which are measurements of the traffic volume at fixed locations in the road network. Since it is hard to find fast algorithms for these problems, he considered heuristic approaches, applied them to a real-world data set and compared them to baseline algorithms to see how well the approaches perform.

### Easily see major patterns in the data set

Custers wanted to visualize a large set of trajectories in a summarized way, making it easy to see major patterns in the data set. To this end, he applied schematization: the act of simplifying data for visualization, where summarizing power is favored over high-fidelity reproduction of the data. He used the underlying road network to provide context for the visualization. He also presented a pipeline that produces the schematization, consisting of simple-to-grasp steps that modify the road network and the set of trajectories simultaneously. Finally, Custers derived summarized patterns from the trajectories, visualized in a metro-map style on the schematized road network. He explored the result of this schematization on two different real-world data sets, which showed that the approach seems promising for distinguishing different patterns of mobility.

Title of PhD thesis: Algorithms for Context-Aware Trajectory Analysis. Supervisors: prof.dr. B. Speckmann and dr. K.A.B. Verbeek.

## Media contact

Anke Langelaan
(Communication Officer)