TY - JOUR
T1 - Toward Automatic Streetside Building Identification With an Integrated YOLO Model for Building Detection and a Vision Transformer for Identification
AU - Krawi, Ossama
AU - Rada, Lavdie
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - In an era of widespread digital imagery and advancing machine vision technologies, automated methods to precisely locate photographed buildings are crucial across various sectors. This research initiates the development of an automated Streetside Building Identification System (SBIS). Leveraging the comprehensive coverage of Google Street View images across major cities worldwide, the research integrates a YOLO model for building detection with a Vision Transformer (ViT) model for building identification, supported by Transfer Learning. This innovative approach aims to pinpoint exact building coordinates in urban environments, overcoming challenges associated with insufficient supporting data. Utilizing Google Street View datasets that cover entire urban landscapes, the proposed method offers efficiency and scalability, simplifying data acquisition and avoiding logistical complexities of manual interaction for targeted collections. Furthermore, it ensures a more inclusive representation of diverse urban environments, recognizing buildings of every shape and architectural style. The system can scan areas covered by Google Street View even without commercial or business data, navigating through limited information scenarios and marking a significant progression from previous studies. The initial system version provides insights into its implementation and discusses potential improvements. Tests against privately collected images show a current accuracy of 94.23%, offering a promising foundation for further refinements. The primary objective is to develop an automated solution capable of creating a comprehensive database for building recognition tasks, eliminating the laborious manual search process for extensive datasets.
KW - Building identification
KW - Vision Transformer
KW - YOLO
KW - computer vision
KW - deep learning
UR - http://www.scopus.com/inward/record.url?scp=105001811450&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2025.3552928
DO - 10.1109/ACCESS.2025.3552928
M3 - Article
AN - SCOPUS:105001811450
SN - 2169-3536
VL - 13
SP - 52901
EP - 52911
JO - IEEE Access
JF - IEEE Access
ER -