University of Twente Student Theses


Deep learning for the semantic segmentation of airborne laser scanner data: training data selection

Umalkar, Dhananjay (2021) Deep learning for the semantic segmentation of airborne laser scanner data: training data selection.

Abstract: Deep learning is among the most powerful techniques for extracting information from massive geo-information data. Typically, geo-information data is delivered in raw form and must be analyzed and interpreted by a human before it can support decision making. Airborne laser scanning (ALS) data records [X, Y, Z, Intensity] for each point, but not a label indicating the class to which the point belongs, e.g., vegetation, ground, building, or water. To extract information from ALS data, an analyst must inspect the data and assign each point to one of these classes. Even for a small area, a point cloud may contain millions of points that must be annotated. Manual labeling ensures the reliability of the labels; however, it is painstaking work that can take considerable time even when several teams are assigned to it, which increases the project's duration and cost. Deep learning is an attractive alternative because it can label points with an accuracy very close to that of manual labeling. Deep learning models are trained on pre-labeled point clouds, and model training is the most critical step in deep learning-based prediction: prediction accuracy depends on the quality of training. The higher the quality and quantity of the training data, the more accurate the prediction. On the other hand, training a model on a large amount of data can take days. Moreover, not all data is equally relevant; training the model on only high-quality data allows it to produce better results without the time and computing power that massive datasets require. This research, which is based on experiments, establishes relationships between sample location, sampling method, sample size, and classification accuracy for effective deep learning model training.
Additionally, it recommends the optimal sampling method, the number of samples to use, and the location of these samples, resulting in high-quality model training with an optimal amount of data and training time. Finally, this research automates the sampling process with an algorithm that makes sample selection effortless.
Item Type: Essay (Master)
Faculty: ITC: Faculty of Geo-information Science and Earth Observation
Programme: Geoinformation Science and Earth Observation MSc (75014)