Method for Processing Photo and Video Data from Camera Traps Using a Two-Stage Neural Network Approach
- Authors: Efremov V.A., Leus A.V., Gavrilov D.A., Mangazeev D.I., Kholodnyak I.V., Radysh A.S., Zuev V.A., Vodichev N.A.
- Affiliation: Moscow Institute of Physics and Technology (National Research University)
- Issue: No 3 (2023)
- Pages: 98-108
- Section: Analysis of Signals, Audio and Video Information
- URL: https://journal-vniispk.ru/2071-8594/article/view/270353
- DOI: https://doi.org/10.14357/20718594230310
- ID: 270353
Abstract
The paper proposes a technology for analyzing camera-trap data using two-stage neural network processing. The first stage separates empty images from non-empty ones. To solve this problem, a comparative analysis of the YOLOv5, YOLOR, and YOLOX architectures was carried out and the best-performing detector model was identified. The second stage classifies the objects found by the detector; the EfficientNetV2, SeResNet, ResNeSt, ReXNet, and ResNet models were compared. To train the detector and the classifier, a data preparation approach was developed that consists of removing duplicate images from the sample. The method was further modified with agglomerative clustering to divide the sample into training, validation, and test subsets. In object detection, the YOLOv5-L6 model performed best, with an accuracy of 98.5% on the dataset. In classification of the detected objects, the ResNeSt-101 architecture performed best, with a recognition quality of 98.339% on the test data.
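
As a concrete illustration of the two-stage scheme, the sketch below wires a detector to a per-crop classifier. It is a minimal sketch only: the `ultralytics/yolov5` Torch Hub entry point and the `resnest101e` model from `timm` stand in for the paper's trained weights, and the confidence threshold is illustrative rather than the authors' setting.

```python
# Sketch of the two-stage pipeline: stage 1 detects animals (so empty
# frames can be discarded), stage 2 classifies every detected crop.
# Public stand-ins are assumed for the paper's trained models:
# "yolov5l6" from ultralytics/yolov5 and "resnest101e" from timm.
import torch
import timm
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform

detector = torch.hub.load("ultralytics/yolov5", "yolov5l6", pretrained=True)
detector.conf = 0.25  # illustrative threshold, not the paper's value

classifier = timm.create_model("resnest101e", pretrained=True).eval()
transform = create_transform(**resolve_data_config({}, model=classifier))

def process(path: str):
    """Return (detector confidence, class probabilities) per crop,
    or None when stage 1 declares the image empty."""
    image = Image.open(path).convert("RGB")
    boxes = detector(image).xyxy[0]  # rows: x1, y1, x2, y2, conf, cls
    if boxes.shape[0] == 0:
        return None                  # stage 1: empty frame, stage 2 skipped
    results = []
    for x1, y1, x2, y2, conf, _ in boxes.tolist():
        crop = image.crop((x1, y1, x2, y2))
        with torch.no_grad():
            probs = classifier(transform(crop).unsqueeze(0)).softmax(-1)
        results.append((conf, probs))
    return results
```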
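The data preparation step can be sketched in the same spirit: after duplicates are removed, agglomerative (single-link, as in Sibson's SLINK) clustering groups near-identical frames, and whole clusters, not individual images, are assigned to the training, validation, and test subsets. The feature representation and the distance threshold below are assumptions; the abstract does not pin them down.

```python
# Duplicate-aware dataset split: cluster near-identical images first,
# then assign entire clusters to train/val/test so near-duplicates
# never straddle subset boundaries. Threshold and features are assumed.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def split_by_clusters(features, threshold=0.3, ratios=(0.8, 0.1, 0.1), seed=0):
    labels = AgglomerativeClustering(
        n_clusters=None,              # grow clusters up to the threshold
        distance_threshold=threshold,
        linkage="single",             # single-link, as in SLINK
    ).fit(features).labels_
    clusters = np.unique(labels)
    np.random.default_rng(seed).shuffle(clusters)
    n_train = int(len(clusters) * ratios[0])
    n_val = int(len(clusters) * ratios[1])
    parts = np.split(clusters, [n_train, n_train + n_val])
    # indices of images per subset; each cluster lands in exactly one subset
    return [np.flatnonzero(np.isin(labels, part)) for part in parts]
```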

About the authors
Vladislav A. Efremov
Moscow Institute of Physics and Technology (National Research University)
Author for correspondence.
Email: efremov.va@phystech.edu
Postgraduate student, programmer of the Laboratory of Digital Systems for Special Purposes
Russian Federation, Dolgoprudny, Moscow Region

Andrey V. Leus
Moscow Institute of Physics and Technology (National Research University)
Email: leus.av@mipt.ru
Candidate of Technical Sciences, Leading Programmer of the Laboratory of Digital Systems for Special Purposes
Russian Federation, Dolgoprudny, Moscow Region

Dmitry A. Gavrilov
Moscow Institute of Physics and Technology (National Research University)
Email: gavrilov.da@mipt.ru
Doctor of Technical Sciences, Director of the Phystech School of Radio Engineering and Computer Technology (FRCT)
Russian Federation, Dolgoprudny, Moscow Region

Daniil I. Mangazeev
Moscow Institute of Physics and Technology (National Research University)
Email: mangazeev.di@phystech.edu
Master's degree, programmer of the Laboratory of Digital Systems for Special Purposes
Russian Federation, Dolgoprudny, Moscow Region

Ivan V. Kholodnyak
Moscow Institute of Physics and Technology (National Research University)
Email: kholodnyak.iv@phystech.edu
Master's degree
Russian Federation, Dolgoprudny, Moscow Region

Alexandra S. Radysh
Moscow Institute of Physics and Technology (National Research University)
Email: radysh.as@phystech.edu
Master's degree
Russian Federation, Dolgoprudny, Moscow Region

Viktor A. Zuev
Moscow Institute of Physics and Technology (National Research University)
Email: zuev.va@phystech.edu
Master's degree
Russian Federation, Dolgoprudny, Moscow Region

Nikita A. Vodichev
Moscow Institute of Physics and Technology (National Research University)
Email: vodichev.na@phystech.edu
Master's degree
Russian Federation, Dolgoprudny, Moscow Region

References
- O’Connell A. F., Nichols J. D., Karanth K. U. Camera traps in animal ecology: Methods and analyses. Berlin, Germany: Springer Science & Business Media, 2011. 279 p.
- He K., Zhang X., Ren S., Sun J. Deep residual learning for image recognition. // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. P. 770–778.
- Gavrilov D. A., Lovtsov D. A. Automated processing of visual information using artificial intelligence technologies // Artificial Intelligence and Decision Making. 2020. No. 4. P. 33–46.
- Lovtsov D. A., Gavrilov D. A. Automated special purpose optical electronic system’s functional diagnosis // Proc. Int. Semin. Electron Devices Des. Prod. SED-2019 (23–24 April 2019). Prague, Czech Republic: IEEE, 2019. P. 70–73.
- Yu X., Wang J., Kays R., Jansen P. A., Wang T., Huang T. Automated identification of animal species in camera trap images // EURASIP Journal on Image and Video Processing. 2013. No. 1. P. 52.
- Chen G., Han T. X., He Z., Kays R., Forrester T. Deep convolutional neural network based species recognition for wild animal monitoring // IEEE International Conference on Image Processing (ICIP). 2014. P. 858–862.
- Gomez-Villa A., Salazar A., Vargas F. Towards automatic wild animal monitoring: Identification of animal species in camera-trap images using very deep convolutional neural networks // Ecological Informatics. 2017. Vol. 41. P. 24–32.
- Nguyen H., Maclagan S. J., Nguyen T. D., Nguyen T., Flemons P., Andrews K., Ritchie E. G., Phung D. Animal recognition and identification with deep convolutional neural networks for automated wildlife monitoring // International Conference on Data Science and Advanced Analytics (DSAA), Tokyo, Japan, 19–21 October 2017. P. 40–49.
- Beery S., Van Horn G., Perona P. Recognition in Terra Incognita // Computer Vision. ECCV 2018. Lecture Notes in Computer Science. Vol. 11220. 2018.
- Norouzzadeh M. S., Morris D., Beery S., Joshi N., Jojic N., Clune J. A deep active learning system for species identification and counting in camera trap images // Methods in Ecology and Evolution. 2021. Vol. 12 (1). P. 150–161.
- Whytock R. C., Świeżewski J., Zwerts J. A. Robust ecological analysis of camera trap data labelled by a machine learning model // Methods in Ecology and Evolution. 2021. Vol. 12 (6). P. 1080–1092.
- Leus A. V., Efremov V. A. Computer vision methods application for camera traps image analysis within the software for the reserves environmental state monitoring // Proceedings of the Mordovia State Nature Reserve. 2021. Vol. 28. P. 121–129.
- Tabak M. A., Norouzzadeh M. S., Wolfson D. W., Sweeney S. J., VerCauteren K. C., Snow N. P., Halseth J. M., Di Salvo P. A., Lewis J. S., White M. D., Teton B., Beasley J. C., Schlichting P. E., Boughton R. K., Wight B., Newkirk E. S., Ivan J. S., Miller R. S. Machine learning to classify animal species in camera trap images: Applications in ecology // Methods in Ecology and Evolution. 2019. Vol. 10 (4). P. 585–590.
- Jocher G. YOLOv5 release v6.1. 2022. https://github.com/ultralytics/yolov5/releases/tag/v6.1.
- Wang C., Yeh I., Liao H. M. You Only Learn One Representation: Unified Network for Multiple Tasks // arXiv preprint arXiv:2105.04206. 2021.
- Ge Z., Liu S., Wang F., Li Z., Sun J. YOLOX: Exceeding YOLO Series in 2021 // arXiv preprint arXiv:2107.08430. 2021.
- Lin T. Y., Maire M., Belongie S., Hays J., Perona P., Ramanan D., Zitnick C. L. Microsoft COCO: Common objects in context // European Conference on Computer Vision. 2014. P. 740–755.
- Hu J., Shen L., Albanie S., Sun G., Wu E. Squeeze-and-Excitation Networks // IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020. Vol. 42. No. 8. P. 2011–2023.
- Zhang H., Wu C., Zhang Z., Zhu Y., Zhang Z., Lin H., Sun Y., He T., Mueller J., Manmatha R., Li M., Smola A. ResNeSt: Split-Attention Networks // arXiv preprint arXiv:2004.08955. 2020.
- Han D., Yun S., Heo B., Yoo Y. J. ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network // arXiv preprint arXiv:2007.00992. 2020.
- Tan M., Le Q. V. EfficientNetV2: Smaller Models and Faster Training // arXiv preprint arXiv:2104.00298. 2021.
- Tan M., Le Q. V. EfficientNet: Rethinking model scaling for convolutional neural networks // arXiv preprint arXiv:1905.11946. 2020.
- Sibson R. SLINK: An Optimally Efficient Algorithm for the Single-Link Cluster Method // The Computer Journal. 1973. Vol. 16. P. 30–34.
