Advances in Perception for Automated Mobile Systems
Chenyang Lu defended his PhD thesis at the Department of Electrical Engineering on January 25th.
In pursuit of increased efficiency and safety in the transportation of goods and people, there is a need for more capable automated mobile systems that can navigate through the world and interact with humans. One of the key enabling technologies required by such systems is perception, i.e. the ability to sense, perceive, and understand the environment in which the system is situated. The promising results obtained within this PhD research are expected to make solid contributions towards the realization of fully automated mobile systems.
For instance, for a car to understand the real world, it needs to ascertain what to do with all kinds of sensory input from cameras, radar systems, and other devices. While multiple options already exist to perform these tasks, with or without artificial intelligence, they depend on a huge amount of human intervention before they can be put to good use. It became quite clear that there was still room for improvement: artificial intelligence can create new possibilities that ultimately deliver more than the traditional capabilities of computers. This dissertation addresses several detailed research objectives that aim to improve perception technology, in the fields of artificial intelligence and computer vision. These research objectives originate from the observation that current perception technologies face specific limitations.
Higher-level environment representations
The first limitation is that current approaches to understanding the environment usually remain at a low abstraction level, e.g. pixels, which makes them inefficient to use directly for high-level tasks. This dissertation presents research on how deep neural networks can directly provide higher-level environment representations that are useful for downstream tasks of automated mobile systems, such as route planning and navigation. The proposed approach takes a regular front-view image as input and efficiently generates an environment description in the format of graphs, which are commonly used in almost all map and navigation apps. These graphs can be easily understood by humans, as well as by downstream navigation tasks.
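To give a rough feel for what such a graph representation looks like (this is a toy sketch, not the model from the thesis; all node names and coordinates are invented), a perceived road layout can be encoded as a directed graph whose nodes are waypoints and whose edges are drivable connections, which is exactly the structure a navigation task can consume:

```python
# Hypothetical example: a perceived road layout encoded as a directed graph,
# the kind of high-level representation a route planner can use directly.
# Node names and coordinates are invented for illustration.

road_graph = {
    # node -> nodes reachable via a drivable connection
    "A": ["B"],          # approach segment
    "B": ["C", "D"],     # junction: continue straight or turn right
    "C": [],
    "D": [],
}

positions = {  # ego-centric (x, y) positions in metres, invented
    "A": (0.0, 0.0), "B": (0.0, 20.0), "C": (0.0, 40.0), "D": (15.0, 20.0),
}

def reachable(graph, start):
    """Return all nodes reachable from `start` via breadth-first search."""
    seen, queue = {start}, [start]
    while queue:
        node = queue.pop(0)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable(road_graph, "A")))  # ['A', 'B', 'C', 'D']
```

A planner can then run standard graph algorithms (shortest path, reachability) on this description instead of reasoning over raw pixels.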
Complementing incomplete observations
Furthermore, the scene we see in the real world is often incomplete. Foreground objects occlude parts of the background environment, and we cannot observe those occluded areas directly unless we move and observe from another perspective. This type of incompleteness is of course undesirable when making decisions. This dissertation proposes methods that can complement incomplete observations, such that the mobile system gains an understanding of areas that it cannot observe directly with its sensors.
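As a toy illustration of the idea (not the learned method from the thesis), consider a top-down occupancy grid in which one cell is hidden behind a foreground object. A naive way to "complement" the observation is to fill the occluded cell from its observed neighbours; the thesis uses deep neural networks for this, but the sketch below conveys the goal of inferring unseen areas:

```python
# Toy sketch (not the thesis's method): fill occluded cells of an occupancy
# grid with the mean of their observed 4-neighbours. Values near 0.0 mean
# free space, values near 1.0 mean occupied; None marks an occluded cell.

OCCLUDED = None

grid = [
    [0.0, 0.0,      0.0],
    [0.0, OCCLUDED, 1.0],
    [1.0, 1.0,      1.0],
]

def complete(grid):
    """Return a copy of `grid` with occluded cells estimated from neighbours."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(h):
        for j in range(w):
            if grid[i][j] is OCCLUDED:
                vals = [grid[i + di][j + dj]
                        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                        if 0 <= i + di < h and 0 <= j + dj < w
                        and grid[i + di][j + dj] is not OCCLUDED]
                if vals:  # estimate only when at least one neighbour is seen
                    out[i][j] = sum(vals) / len(vals)
    return out

completed = complete(grid)
print(completed[1][1])  # 0.5: the mean of neighbours 0.0, 1.0, 0.0, 1.0
```

A learned model can of course exploit far richer context (shapes, semantics, priors over scene layout) than this local average, which is precisely what makes deep networks attractive for the task.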
Finally, training state-of-the-art deep neural networks requires large-scale annotated datasets that are typically very expensive to obtain. Efforts are made to reduce the burden of manual data annotation for the proposed deep neural network-based methods, so that the designed functionalities can be enabled without hiring expensive human annotators.
This dissertation combines fundamental contributions with more applied contributions that are relevant to automated mobile systems, such as self-driving vehicles.
Title of PhD thesis: Advances in Perception for Automated Mobile Systems. Supervisors: Gijs Dubbelman, and Peter de With.