M&CS student finishes second in international algorithmic programming contest
Over the last few days, the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems conference took place. In conjunction with the conference, the GIS Cup 2017 is held; an algorithmic programming contest. Tom van Diggelen, M&CS Master’s student, finished second place in this contest. He was co-supervised by Assistant Professors Wouter Meulemans and Kevin Buchin, and assisted by Yago Diez Donoso, Assistant Professor at Tohoku University.
This year’s contest focused on the growing research area of trajectory computing. Massive amounts of spatial trajectories are generated nowadays by smartphones, infrastructure, computer games, natural phenomena, and many other sources. An important task for many geographic information systems (GIS) is to find trajectories that are similar or close to some given trajectory. For this contest, the goal is to answer many such queries for a large database of trajectories, concentrating on a very intuitive measure called Fréchet distance1.
Tom’s solution combines spatial hashing2, simplification3, and optimized decision algorithms4 to arrive at an algorithm that can answer queries by inspecting only a small number of trajectories in detail: in particular, only those trajectories that are really close to the threshold of being “similar enough” go through detailed, comparatively costly computations.
Next to winning a monetary prize, Tom van Diggelen was invited to submit a four page paper, which has been published in the conference proceedings, and give a very well received presentation about his work in the contest session.
1. Fréchet distance: The Fréchet distance is formally defined using homeomorphisms between two trajectories, but can be explained more easily using the following analogy. Consider a man walking on one trajectory and a dog walking on the other. The Fréchet distance is the shortest leash length needed, such that the man and dog can both walk from start to finish while they are connected by such a (short) leash. The man and dog can vary their speeds but are not allowed to walk backwards. (See image: The two trajectories are colored red / blue (e.g. the man / dog) and some of the intermediate leashes are shown using the then green lines.)
2. Spatial hashing: We divide the geographic space spanned by the trajectories into square “buckets” and sort the trajectories into these buckets based on their end points. This allows us to quickly find trajectories that have end points near to a query trajectory (a necessity for being geometrically similar in this challenge), without having to iterate over all trajectories in the database: we need to collect only those in one or a small number of buckets.
3. Simplification: Trajectories may consist of a large number of (GPS) measurements, but the overall shape can often be described using only a small representative subset of these. Simplification is the process of doing a filtering of the measurements such that the remaining measurements still capture the overall shape well (up to some small error). Where possible, we use these approximations to decide whether two trajectories are similar enough. Since they contain fewer measurements, computations with these simplified trajectories is significantly faster.
4. Optimized decision algorithm: In 1995 Alt and Godau developed an algorithm that could decide whether the Fréchet distance between two trajectories is at most some given value. However, the running time of the algorithm behaves as a quadratic function of the number of measurements in the trajectories, which is problematic for long trajectories. We use an algorithm that has some practical speed ups and strategically re-uses information from the simplification steps, to avoid this quadratic behavior for most cases.