Generating Realistic Images of Oil and Gas Infrastructure in Satellite Imagery Using Diffusion Models
- Authors: Lobanov V.K., Kondrashina M.S., Gadzhiev S.M., Sokibekov M.S.
- Affiliation: RUDN University
- Issue: Volume 26, No. 3 (2025)
- Pages: 266-272
- Section: Articles
- URL: https://journal-vniispk.ru/2312-8143/article/view/350894
- DOI: https://doi.org/10.22363/2312-8143-2025-26-3-266-272
- EDN: https://elibrary.ru/YICUJW
- ID: 350894
Abstract
This study investigated the feasibility of applying machine learning methods, specifically generative models, to the semantic editing of satellite imagery. The research focused on an architecture based on diffusion models capable of generating desired objects directly on satellite images. However, significant shortcomings were identified in the standard model with regard to realism and consistency with the surrounding context, given the specific nature of the chosen subject area, namely the generation of realistic images of oil and gas infrastructure objects (such as pipelines). To address this limitation, the neural network was fine-tuned with the objective of enhancing the quality of visualized pipeline-related design solutions. A methodological approach for creating the training dataset was proposed and described in detail. Based on actual pipeline routes, spatially referenced vector layers were created in QGIS, and a set of satellite image tiles with precise pipeline boundary annotations was generated. The experimental fine-tuning demonstrated a significant improvement in the quality of generated images depicting oil and gas infrastructure in satellite imagery compared to the original, non-adapted model. The fine-tuned model generates highly realistic pipelines, effectively integrating them into the existing landscape within the image. Visual comparison of results before and after fine-tuning confirms the elimination of artifacts and the achievement of the required level of detail. This work demonstrates the effectiveness of creating task-specific datasets and fine-tuning for solving specialized visualization tasks in remote sensing.
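The dataset-creation step described in the abstract — turning vector pipeline routes into annotated image tiles — can be sketched as follows. This is an illustrative assumption about the workflow, not the authors' actual code: the function name, tile size, and corridor width are hypothetical, and the route coordinates are assumed to have already been reprojected from the QGIS layer into tile pixel coordinates. The route polyline is rasterized into a binary mask tile marking the region an inpainting-style diffusion model would be asked to edit:

```python
from PIL import Image, ImageDraw

def pipeline_mask_tile(polyline_px, tile_size=512, width_px=8):
    """Rasterize a pipeline centerline (given in tile pixel coordinates)
    into a binary mask tile for inpainting-style fine-tuning.

    The route is drawn as a thick line; the resulting white corridor
    marks the region the generative model should fill with pipeline
    imagery, while black pixels are kept from the source tile.
    """
    mask = Image.new("L", (tile_size, tile_size), 0)   # black background
    draw = ImageDraw.Draw(mask)
    draw.line(polyline_px, fill=255, width=width_px)   # white corridor
    return mask

# Hypothetical route crossing a 512x512 tile.
route = [(20, 490), (180, 300), (350, 260), (500, 40)]
mask = pipeline_mask_tile(route)
```

Each mask tile would then be paired with its satellite image tile to form one training example for the fine-tuning stage.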
About the authors
Vasily Lobanov
RUDN University
Email: lobanov_vk@pfur.ru
ORCID ID: 0000-0001-8163-9663
Código SPIN: 7266-5340
Senior Lecturer of the Department of Mechanics and Control Processes, Academy of Engineering
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
Mariia Kondrashina
RUDN University
Corresponding author
Email: 1132236536@rudn.ru
ORCID ID: 0009-0008-8526-9143
Master student of the Department of Mechanics and Control Processes, Academy of Engineering
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
Shamil Gadzhiev
RUDN University
Email: 1132236511@rudn.ru
ORCID ID: 0009-0006-1570-4133
Master student of the Department of Mechanics and Control Processes, Academy of Engineering
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
Maksad Sokibekov
RUDN University
Email: 1032185455@rudn.ru
ORCID ID: 0009-0009-0261-7374
Master student of the Department of Architecture, Restoration and Design, Academy of Engineering
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
Bibliography
- Immanuel SA, Cho W, Heo J, Kwon D. Tackling Few-Shot Segmentation in Remote Sensing via Inpainting Diffusion Model. ICLR 2025 Machine Learning for Remote Sensing (ML4RS) Workshop. 2025. https://doi.org/10.48550/arXiv.2503.03785
- Rombach R, Blattmann A, Lorenz D, Esser P, Ommer B. High-Resolution Image Synthesis with Latent Diffusion Models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2022 June 18-24; New Orleans, LA, USA. IEEE. 2022:10674-10685. https://doi.org/10.1109/CVPR52688.2022.01042
- Panboonyuen T, Charoenphon C, Satirapod C. SatDiff: A Stable Diffusion Framework for Inpainting Very High-Resolution Satellite Imagery. IEEE Access. 2025;13:51617-51631. https://doi.org/10.1109/ACCESS.2025.3551782
- Kingma DP, Welling M. Auto-Encoding Variational Bayes. International Conference on Learning Representations (ICLR). 2014. https://doi.org/10.48550/arXiv.1312.6114
- Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI). 2015;9351:234-241. https://doi.org/10.48550/arXiv.1505.04597
- Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, et al. Learning Transferable Visual Models from Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, PMLR. 2021;139:8748-8763. https://doi.org/10.48550/arXiv.2103.00020
- Liu F, Chen D, Guan Z, Zhou X, Zhu J, Ye Q, et al. RemoteCLIP: A Vision Language Foundation Model for Remote Sensing. IEEE Transactions on Geoscience and Remote Sensing. 2024;62:1-16. https://doi.org/10.1109/TGRS.2024.3390838
- He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016:770-778. https://doi.org/10.48550/arXiv.1512.03385
- Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale. International Conference on Learning Representations (ICLR). 2021. https://doi.org/10.48550/arXiv.2010.11929


