Lightweight Multi-Object Detection for Construction Sites Based on YOLO-World

Keywords

Dense
LCS-YOLO
RGNet
ADown
Re-parameterized

DOI

10.26689/jera.v9i5.11992

Submitted : 2025-09-17
Accepted : 2025-10-02
Published : 2025-10-17

Abstract

Current construction-site detection algorithms suffer from missed detections, false positives, and high model complexity, problems made worse by occlusion and scale variation in dense environments. To address them, this paper proposes LCS-YOLO, a lightweight multi-object detection model for construction sites based on YOLO-World that aims to balance detection efficiency and accuracy. We propose the RGNet (Re-parameterization GhostNet) module, which integrates re-parameterized convolutions and a multi-branch architecture. It reduces information redundancy in intermediate feature maps while improving feature extraction and gradient flow. Combined with the adaptive downsampling module ADown (Adaptive Downsampling), it captures image features more effectively and compresses them spatially, reducing model complexity while enhancing the interaction between images and text. Experiments show that LCS-YOLO outperforms the comparison models in overall performance, achieving a balance between accuracy and efficiency.
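
The abstract does not give the internals of RGNet, so the following is only a minimal sketch of the general idea behind re-parameterized convolutions, in the style of RepVGG: a multi-branch block used during training (a 3x3 branch, a 1x1 branch, and an identity branch) can be folded into a single 3x3 kernel for inference without changing the output. It assumes a single channel, stride 1, zero padding, and no batch normalization. All function names are hypothetical and are not taken from the paper.

```python
# Illustrative sketch of structural re-parameterization (RepVGG-style).
# Assumption: this is NOT the paper's RGNet; it shows branch merging only.
# Single channel, stride 1, zero padding, batch normalization omitted.

def conv2d_same(image, kernel):
    """3x3 cross-correlation with zero padding so output size equals input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-1, 2):
                for dj in range(-1, 2):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch, summed."""
    y3 = conv2d_same(image, k3)
    return [
        [y3[i][j] + k1 * image[i][j] + image[i][j] for j in range(len(image[0]))]
        for i in range(len(image))
    ]


def merge_branches(k3, k1):
    """Inference-time form: fold the 1x1 weight and the identity into the 3x3 center."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0  # a 1x1 kernel and the identity both act on the center tap
    return merged


image = [[1.0, 2.0, 0.0], [0.0, 3.0, 1.0], [4.0, 1.0, 2.0]]
k3 = [[0.5, -1.0, 0.0], [2.0, 1.0, -0.5], [0.0, 1.5, 1.0]]
k1 = 0.25

train_out = multi_branch(image, k3, k1)
deploy_out = conv2d_same(image, merge_branches(k3, k1))

# The two forms should agree up to floating-point rounding.
max_diff = max(
    abs(a - b) for ra, rb in zip(train_out, deploy_out) for a, b in zip(ra, rb)
)
print("max difference between the two forms:", max_diff)
```

In a full network the same folding is applied per output channel, and each branch's batch normalization is first absorbed into its convolution weights and bias. The benefit is that the deployed model runs one convolution per block while keeping the training-time advantages of the multi-branch design.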
