University of Twente Student Theses


Deep learning on MLS point clouds : feasibility of improving semantic segmentation result using part segmentation

Yousefimashhoor, S. (2022) Deep learning on MLS point clouds : feasibility of improving semantic segmentation result using part segmentation.

Abstract: Smart cities are interested in using the latest sensing and processing technologies to increase their operational efficiency. Two cutting-edge technologies, mobile laser scanning (MLS) and deep learning (DL), can be utilized to serve this goal. On the one hand, as a rapid data acquisition method, MLS provides highly accurate geometry data in the form of dense point clouds. On the other hand, 3D DL algorithms operating on point clouds can provide 3D scene understanding and answer many questions in academia and industry. Therefore, in this study, we work at the intersection of MLS and DL to investigate the quality of urban asset inventory. Deep learning models are typically designed to work with datasets in which classes are evenly distributed, a characteristic that is naturally absent in an urban scene. As a result, in most DL methods, underrepresented classes like urban furniture (such as poles, signs, benches, and trashcans), which account for less than 1% of the training and inference data, are treated as noise and not segmented properly. To address this issue, a two-step pipeline is proposed and tested. First, semantic segmentation provides a coarse prediction of the shape and location of the asset of interest. Then, a more refined segmentation is achieved with part segmentation on the local neighborhood of the detected asset. The idea is inspired by a previous study showing that part segmentation can properly decompose isolated pole-like objects into their constituent parts (Yousefimashhoor, 2022). In this research, we investigate whether a trained part segmentation model can be used to improve the semantic segmentation result in a local neighborhood. To study the feasibility of the proposed pipeline, a pilot class of interest (pole), a state-of-the-art deep learning algorithm on point clouds (KPConv), and publicly available data for training and testing (the NPM3D dataset) are chosen.
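As a rough illustration of how the first step could hand over to the second, the sketch below crops a local neighborhood (a "tile") around the points that a semantic segmentation model labeled as pole. This is not the thesis implementation: the function name `extract_tile`, the square footprint, the `half_width` value, and the simplification to a single detected pole are all assumptions made for the example.

```python
def extract_tile(points, labels, target="pole", half_width=2.0):
    """Return indices of points inside a square tile centered on the target class.

    points: list of (x, y, z) tuples.
    labels: predicted class name for each point (same length as points).
    half_width: hypothetical half side length of the tile, in the units of x and y.
    Assumes all target-labeled points belong to one object.
    """
    hits = [p for p, label in zip(points, labels) if label == target]
    if not hits:
        return []  # nothing detected, so there is no tile to refine

    # Center the tile on the horizontal centroid of the detected points.
    cx = sum(p[0] for p in hits) / len(hits)
    cy = sum(p[1] for p in hits) / len(hits)

    # Keep every point (of any class) that falls inside the square footprint,
    # so the second model also sees the surroundings of the detected object.
    return [
        i
        for i, p in enumerate(points)
        if abs(p[0] - cx) <= half_width and abs(p[1] - cy) <= half_width
    ]


points = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (10.0, 10.0, 0.0)]
labels = ["pole", "pole", "ground", "ground"]
tile_indices = extract_tile(points, labels)
```

In a scene with several poles, the detected points would first need to be grouped into separate objects (for example by clustering) before a tile is cut for each one; that step is left out here.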
The KPConv deep learning model is trained for two tasks: semantic segmentation of the scene and part segmentation on poles’ local neighborhoods. The trained models are used to infer labels from raw point clouds, and their outputs are compared within the same spatial extents, once at the tile level and once at the scene level, after inserting the tiles back into the original prediction point cloud. In addition, some method adaptations are investigated to improve the results, such as enlarging the tiles or including intensity values. Finally, the two models are integrated into one pipeline to simulate the real-world scenario in which the input of the part segmentation task is the output of semantic segmentation. The results show that, on perfect tiles, the best adaptation of our proposed method achieves a maximum improvement of 6.4% in the IoU of the pole class at the tile level and improves the semantic segmentation of poles by 1.6% at the scene level. Qualitative inspection of the results reveals that the performance of the part segmentation model varies from tile to tile and depends mainly on the topological relation of the pole with other classes (isolated, intersecting, or attached) and on the point density of the tile. The simulation of a real-world scenario with our proposed pipeline shows that the selection of neighborhood tiles plays a crucial role in achieving the best result. With imperfect tiles extracted from the semantic segmentation results, a minor improvement in recall (around 1%) and a significant decline in precision (around 5%) are observed, which stems from the bias of the training set toward the object of interest.
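The figures above are per-class IoU, precision, and recall for the pole class. A minimal sketch of how these three metrics are computed from per-point labels follows; the function name `pole_metrics` and the toy labels are illustrative assumptions, not data or code from the thesis.

```python
def pole_metrics(y_true, y_pred, positive="pole"):
    """Compute IoU, precision, and recall for one class from per-point labels."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # pole point correctly predicted as pole
        elif pred == positive:
            fp += 1  # non-pole point predicted as pole
        elif truth == positive:
            fn += 1  # pole point missed by the model

    def ratio(numerator, denominator):
        # Return 0.0 instead of dividing by zero when the class is absent.
        return numerator / denominator if denominator else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Toy example: 2 true positives, 1 false positive, 1 false negative.
y_true = ["pole", "pole", "ground", "ground", "pole"]
y_pred = ["pole", "ground", "pole", "ground", "pole"]
metrics = pole_metrics(y_true, y_pred)
```

Because false positives appear in the denominators of both precision and IoU, a model biased toward predicting pole can raise recall slightly while lowering precision, which is the pattern reported for the imperfect tiles.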
The proposed method has three main limitations: semantic segmentation errors propagate into the part segmentation and increase false predictions, the tile selection limits the available contextual information, and the model bias toward the pole class leads to a high false positive rate. These limitations confirm that the problem we have investigated is not easy to solve, and there is no clear-cut solution that works in all scenarios. In this light, our work helps to clarify which specific approaches can be viable and which pitfalls should be avoided.
Item Type:Essay (Master)
Faculty:ITC: Faculty of Geo-information Science and Earth Observation
Subject:54 computer science, 74 (human) geography, cartography, town and country planning, demography
Programme:Geoinformation Science and Earth Observation MSc (75014)