Vol. 27, No. 1

Airiti Library

Pages:

1-14

Title (Chinese)

應用深度學習於高解析衛星影像臺灣農業區建物分塊

Title

A Deep Learning Approach for Building Segmentation in Taiwan Agricultural Area Using High Resolution Satellite Imagery

Author (Chinese)

劉良逸、王驥魁、黃安德

Author

Liang-Yi Liu, Chi-Kuei Wang, An-Te Huang

Chinese Abstract (translated)

Taiwan's arable land is limited, so inventorying building area helps clarify land-use conditions. To determine the total area occupied by buildings in Taiwan's agricultural areas, one existing practice is manual identification from high-resolution satellite imagery, which captures building boundaries and avoids the inconvenience of field surveys; however, it demands substantial manpower. Previous studies have shown that deep learning methods can effectively segment buildings in high-resolution satellite imagery. This study therefore trains the ENVINet5 deep learning model on Pleiades pansharpened color imagery to segment buildings in Taiwan's agricultural areas. Because building patterns differ from region to region, images from nine different counties/cities were used for training, each training image measuring 2500 pixels × 2500 pixels. The model was evaluated on both the pixels of the validation set and the segmented building polygons. The former showed that the trained model recovers 84% of building pixels; the latter counted the building polygons and compared them with the reference buildings using IoU (Intersection over Union). The results show that the model detects and segments 92% of the buildings in the imagery, with IoU concentrated between 0.6 and 0.9. The model was also run on the testing set as a transferability test. In addition, this study proposes an image tiling and stitching method for handling wide-area satellite imagery. Finally, using the ENVINet5 results to aid manual building identification saved 7.3% of the time cost.

Abstract

Understanding buildings in agricultural areas is important because the arable land in Taiwan is limited. One practical approach is manual digitization from high-resolution satellite imagery, which can yield satisfactory results without field investigation. However, such practice is tedious and labour-intensive. For these reasons, past research on deep learning has shown that convolutional neural networks are useful for building segmentation in satellite imagery. In this study, an ENVINet5 model was trained on high-resolution Pleiades pansharpened imagery. The training images (2500 pixels × 2500 pixels each) were randomly selected from 9 counties/cities to increase diversity, because each county/city has different building patterns. The performance of the ENVINet5 model was evaluated at the pixel and polygon levels, respectively. The pixel-based evaluation showed that the trained model can find 84% of building pixels. The polygon-based evaluation counted the building segments and compared them with the reference data using IoU (Intersection over Union). The results showed that 92% of building segments were found, and the IoU of most building segments ranges between 0.6 and 0.9. The trained model was also validated on the testing images as a transferability test. Moreover, an image tiling and stitching technique was proposed to deal with large satellite imagery. Finally, we compared the time costs of labelling with and without the aid of the deep learning approach; the time cost decreased by 7.3% with its help.
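The IoU (Intersection over Union) metric used in the polygon-based evaluation can be illustrated on rasterized building masks; this is a minimal sketch for clarity, not the authors' code:

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two pixel masks.

    Each mask is a set of (row, col) pixel coordinates
    belonging to a building segment.
    """
    inter = len(mask_a & mask_b)
    union = len(mask_a | mask_b)
    return inter / union if union else 0.0

# Two overlapping 3x3 building footprints, offset by one column.
a = {(r, c) for r in range(3) for c in range(3)}
b = {(r, c) for r in range(3) for c in range(1, 4)}
print(iou(a, b))  # 6 shared pixels / 12 total = 0.5
```

A segmented polygon with IoU in the 0.6-0.9 range, as reported above, overlaps its reference building substantially while still deviating at the boundary.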

Keywords

Building Segmentation, Deep Learning, High Resolution Satellite Imagery

Attachment

Airiti Library

https://www.airitilibrary.com/Publication/alDetailedMesh?DocID=10218661-202112-202201040011-202201040011-1-14

Remarks

N/A

Pages:

15-28

Title (Chinese)

應用深度學習進行 MMS 點雲語意分割及影像分類產製高精地圖

Title

Applying Deep Learning to MMS Point Cloud Semantic Segmentation and Image Classification for HD Map Generation

Author (Chinese)

張芷瑀、曾義星

Author

Chih-Yu Chang, Yi-Hsing Tseng

Chinese Abstract (translated)

This study applies deep learning (DL) techniques to propose a complete workflow for producing traffic-sign HD maps, so that the information needed to build HD maps can be extracted more automatically from the LiDAR and image data collected by a mobile mapping system (MMS), addressing the time and labour costs of traditional production. First, PointNet is used to extract traffic islands, traffic signs, signals, and pole-like objects from the point clouds. The traffic-sign point clouds are then clustered with the DBSCAN algorithm to obtain geometric information and evaluate its accuracy. Next, the point cloud of each traffic-sign cluster is projected onto the corresponding images for classification, and its semantic information is confirmed with GoogLeNet and a signal-to-noise-ratio measure. The final output follows the traffic-sign format required by HD maps, for use in autonomous-vehicle development.

Abstract

The ongoing race toward an autonomous era has driven the development of High Definition (HD) maps. To help extend the vision of self-driving vehicles and guarantee safety, HD maps provide detailed information about on-road environments with precise locations and semantic meaning. However, one main challenge in making such a map is that it requires a massive amount of manual annotation, which is time-consuming and laborious. As such, automating the extraction of information from the sheer amount of data collected by mobile LiDAR scanners and cameras is of utmost concern. In this study, a workflow for automatically building traffic-sign HD maps is proposed. First, traffic islands, traffic signs, signals, and poles are extracted from LiDAR point clouds using PointNet. Then, point clouds of traffic signs are clustered by the DBSCAN algorithm so that geometric information can be obtained; an evaluation is performed in the final stage to assess the geolocation accuracy. Next, the point clouds in each traffic-sign cluster are projected onto the corresponding MMS images for classification. The semantic attribute is obtained from the GoogLeNet classifier and determined by a proposed mechanism, a modified SNR, which ensures that the class with the most classified images is dominant enough for the cluster to be assigned that specific type. An output text file containing the precise coordinates of the traffic-sign center, the bottom-left and top-right corners of the traffic-sign bounding box, and the sign type is generated for further use in HD maps.
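The per-cluster decision — accept the majority class only when it is sufficiently dominant — can be sketched as follows. The simple vote-ratio threshold here is a hypothetical stand-in for the paper's modified-SNR mechanism, whose exact formula is not given in the abstract:

```python
from collections import Counter

def cluster_label(image_labels, min_ratio=0.5):
    """Assign a traffic-sign type to a cluster of projected images.

    image_labels: classifier outputs for every image of the cluster.
    The majority class is accepted only if it accounts for more than
    min_ratio of the votes; otherwise the cluster is left undecided.
    (min_ratio is a hypothetical stand-in for the modified SNR test.)
    """
    if not image_labels:
        return None
    label, count = Counter(image_labels).most_common(1)[0]
    return label if count / len(image_labels) > min_ratio else None

print(cluster_label(["stop", "stop", "yield", "stop"]))  # stop (3/4 votes)
print(cluster_label(["stop", "yield"]))                  # None (tie, undecided)
```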

Keywords

HD maps, MMS, Deep Learning, Point cloud semantic segmentation, Image classification

Attachment

Airiti Library

https://www.airitilibrary.com/Publication/alDetailedMesh?DocID=10218661-202112-202201040011-202201040011-209-220

Remarks

N/A

Pages:

29-42

Title (Chinese)

以單眼視覺SLAM建立場景三維特徵地圖輔助相機定位之定位精度與適應性分析

Title

Feasibility and Accuracy Analysis of Camera Localization Aided by a 3D Feature Map Established by Monocular Visual SLAM

Author (Chinese)

林緯程、饒見有

Author

Wei-Cheng Lin, Jiann-Yeou Rau

Chinese Abstract (translated)

Feature-based Visual SLAM (Visual Simultaneous Localization and Mapping, V-SLAM) localizes a camera through repeated cycles of 3D feature-map building, matching image features to the 3D feature map, and space resection. If the 3D feature map it builds can be reused as a priori control for the scene, the localization problem in environments without reliable GNSS signals can be solved. However, ambient lighting conditions often affect image feature extraction and matching, so it is worth investigating how well camera localization based on 3D feature maps built under different lighting conditions adapts to changes in lighting. This study feeds 3D feature maps built under different lighting conditions into the ORB-SLAM system and uses its map-reuse function to aid camera localization. The results show that camera localization aided by a 3D feature map can reach an accuracy close to that of the reference map, making it a possible solution to localization in GNSS-denied environments.

Abstract

Feature-based Visual Simultaneous Localization and Mapping (V-SLAM) systems localize cameras by repeatedly establishing 3D feature maps, matching image features to those maps, and conducting space resection. By reusing the 3D feature maps established during SLAM processing, a map can serve as a control field to solve the GNSS-denied localization problem. However, different lighting conditions in outdoor environments can affect the results of image feature extraction and matching, leading to different localization results. This study applies ORB-SLAM to establish the 3D feature maps, and map-aided camera localization is conducted under the ORB-SLAM Localization Mode. The experimental results demonstrate that camera localization aided by a 3D feature map can reach an accuracy close to that of its reference 3D feature map. As a result, camera localization aided by a 3D feature map can be a potential solution to the GNSS-denied localization problem.
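Comparing a map-aided trajectory against a reference trajectory is commonly done with a positional root-mean-square error; this is a minimal sketch of such an evaluation (not the paper's code), assuming both trajectories are already expressed in the same coordinate frame with corresponding poses:

```python
import math

def position_rmse(estimated, reference):
    """Root-mean-square error between corresponding 3D camera positions."""
    assert len(estimated) == len(reference)
    sq = sum(
        (xe - xr) ** 2 + (ye - yr) ** 2 + (ze - zr) ** 2
        for (xe, ye, ze), (xr, yr, zr) in zip(estimated, reference)
    )
    return math.sqrt(sq / len(estimated))

# Hypothetical camera positions: one pose is off by 0.1 m in x.
est = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(position_rmse(est, ref))  # ≈ 0.0707
```

In practice the estimated trajectory must first be aligned to the reference frame (e.g. by a similarity transformation) before such an error is meaningful.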

Keywords

ORB-SLAM, 3D Feature Map, Coordinate Transformation

Attachment

Airiti Library

https://www.airitilibrary.com/Publication/alDetailedMesh?DocID=10218661-202112-202201040011-202201040011-221-227

Remarks

N/A

Pages:

43-60

Title (Chinese)

利用拓樸約制條件協助LOD-2屋頂模型重建

Title

LOD-2 Roof Models Reconstruction Assisted by Topological Constraints

Author (Chinese)

周孟圻、饒見有

Author

Meng-Chi Chou, Jiann-Yeou Rau

Chinese Abstract (translated)

With the rapid development of 3D geographic information systems, the demand for 3D city models keeps growing, and 3D building models play a very important role in them. This study proposes a reconstruction scheme for LOD-2 3D roof models: point clouds produced by UAV photogrammetry and airborne laser scanning provide 3D coordinate observations, 2D polygon data define the planar boundary of each roof structure, and a least-squares adjustment then fits the planes to obtain the position and distribution of each roof face in 3D space. In addition, topological information among the polygons is established to supply additional constraint and geometric correction conditions in the adjustment, preventing topological errors such as gaps or overlaps between polygons, and the 3D roof models are successfully reconstructed.

Abstract

3D building models are among the most important elements in digital city analysis and are widely used in geographic activities such as smart cities, urban planning, and disaster management. The quality of a 3D building model is related to its structure and geometry. CityGML, an international 3D city modeling standard, defines a scale of Levels of Detail (LODs) to express how detailed a 3D building model is, named LOD-0 (coarsest) to LOD-4 (most detailed). The main goal of our study is to reconstruct LOD-2 3D building models, which consist of detailed roof structures with vertical facades. Since the 3D roof structure is the most important part of the LOD-2 model, we focus on reconstructing complete, high-accuracy, and topologically error-free 3D roof models. We create point clouds and an Object Height Model (OHM) from Unmanned Aerial Vehicle (UAV) images and Airborne Laser Scanning (ALS), extract 2D polygons of the roof structure by manual digitization, and then perform a least-squares adjustment to fit the 3D roof planes. Considering the geometry of and relationships between adjacent polygons, we add constraint conditions and apply roof-plane corrections to avoid topological errors. Eventually, we reconstruct several typical roof models in Taiwan and conduct an accuracy analysis by manually measuring the 3D coordinates of roof corners. The results conform to the LOD-2 standard of CityGML, confirming that the proposed 3D roof model reconstruction method is feasible for various roof types with high accuracy.
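The core plane-fitting step — adjusting z = a·x + b·y + c to the points of one roof face by least squares — can be sketched in a few lines. This is a minimal unconstrained version for illustration only; the paper's adjustment additionally imposes the topological constraints between adjacent polygons:

```python
def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to 3D points,
    via the normal equations solved with Cramer's rule."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)

    def det3(m):  # determinant of a 3x3 matrix
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(A)
    sol = []
    for i in range(3):  # Cramer's rule: replace column i with rhs
        m = [row[:] for row in A]
        for r in range(3):
            m[r][i] = rhs[r]
        sol.append(det3(m) / d)
    return tuple(sol)  # (a, b, c)

# Hypothetical roof points lying exactly on z = 2x + 3y + 5.
pts = [(0, 0, 5), (1, 0, 7), (0, 1, 8), (1, 1, 10), (2, 1, 12)]
a, b, c = fit_plane(pts)
print(a, b, c)  # recovers 2, 3, 5
```

In a full adjustment, ridge and eave lines shared by two roof faces would enter as additional condition equations so the fitted planes meet without gaps or overlaps.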

Keywords

LOD-2 3D Roof Model, Least-Squares Adjustment, Topological Constraints

Attachment

Airiti Library

https://www.airitilibrary.com/Publication/alDetailedMesh?DocID=10218661-202112-202201040011-202201040011-229-246

Remarks

N/A
