Analysis of modern SOTA architectures of artificial neural networks for solving problems of image classification and object detection
- Authors: Korchagin V.D.
- Issue: No. 4 (2023)
- Pages: 73-87
- Section: Articles
- URL: https://journal-vniispk.ru/2454-0714/article/view/359432
- DOI: https://doi.org/10.7256/2454-0714.2023.4.69306
- EDN: https://elibrary.ru/MZLZMK
- ID: 359432
About the authors
Valeriy Dmitrievich Korchagin
Email: valerak249@gmail.com
ORCID iD: 0009-0003-1773-0085
References
1. Gomolka Z. Using artificial neural networks to solve the problem represented by BOD and DO indicators // Water. – 2017. – Vol. 10. – No. 1. – P. 4.
2. Kadurin A., Nikolenko S., Arkhangelskaya E. Deep Learning: Immersion into the World of Neural Networks. – St. Petersburg: Piter, 2018. – 480 p.
3. Dzhabrailov Sh.V., Rozaliev V.L., Orlova Yu.A. Approaches to and implementations of computer simulation of intuition // The Eurasian Scientific Journal (Vestnik Evraziyskoy Nauki). – 2017. – No. 2 (39).
4. Babushkina N.E., Rachev A.A. Choosing a neural network activation function depending on the problem conditions // Innovative Technologies in Mechanical Engineering, Education and Economics. – 2020. – Vol. 27, No. 2 (16). – Pp. 12-15.
5. Sosnin A.S., Suslova I.A. Neural network activation functions: sigmoid, linear, step, ReLU, tanh. – 2019. – P. 237.
6. Bredikhin A.I. Training algorithms for convolutional neural networks // Bulletin of Yugra State University (Vestnik YuGU). – 2019. – No. 1 (52).
7. Hu J., Shen L., Sun G. Squeeze-and-excitation networks // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. – 2018. – Pp. 7132-7141.
8. Gastaldi X. Shake-Shake regularization // arXiv preprint arXiv:1705.07485. – 2017.
9. DeVries T., Taylor G.W. Improved regularization of convolutional neural networks with cutout // arXiv preprint arXiv:1708.04552. – 2017.
10. He K. et al. Deep residual learning for image recognition // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. – 2016. – Pp. 770-778.
11. Tan M., Le Q. EfficientNet: Rethinking model scaling for convolutional neural networks // International Conference on Machine Learning. – PMLR, 2019. – Pp. 6105-6114.
12. Tan M. et al. MnasNet: Platform-aware neural architecture search for mobile // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. – 2019. – Pp. 2820-2828.
13. Dosovitskiy A. et al. An image is worth 16x16 words: Transformers for image recognition at scale // arXiv preprint arXiv:2010.11929. – 2020.
14. Vaswani A. et al. Attention is all you need // Advances in Neural Information Processing Systems. – 2017. – Vol. 30.
15. Liu Z. et al. Swin Transformer: Hierarchical vision transformer using shifted windows // Proceedings of the IEEE/CVF International Conference on Computer Vision. – 2021. – Pp. 10012-10022.
16. Liu Z. et al. Swin Transformer V2: Scaling up capacity and resolution // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. – 2022. – Pp. 12009-12019.
17. Dai Z. et al. CoAtNet: Marrying convolution and attention for all data sizes // Advances in Neural Information Processing Systems. – 2021. – Vol. 34. – Pp. 3965-3977.
18. Zhai X. et al. Scaling vision transformers // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. – 2022. – Pp. 12104-12113.
19. Huang Y. et al. GPipe: Efficient training of giant neural networks using pipeline parallelism // Advances in Neural Information Processing Systems. – 2019. – Vol. 32.
20. Emelyanov S.O., Ivanova A.A., Shvets E.A., Nikolaev D.P. Methods of training set augmentation in image classification tasks // Sensory Systems (Sensornye Sistemy). – 2018. – Vol. 32, No. 3. – Pp. 236-245. – doi: 10.1134/S0235009218030058.
21. Cubuk E.D. et al. AutoAugment: Learning augmentation strategies from data // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. – 2019. – Pp. 113-123.
22. Han D., Kim J., Kim J. Deep pyramidal residual networks // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. – 2017. – Pp. 5927-5935.
23. Yamada Y. et al. ShakeDrop regularization for deep residual learning // IEEE Access. – 2019. – Vol. 7. – Pp. 186126-186136.
24. Kolesnikov A. et al. Big Transfer (BiT): General visual representation learning // Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part V. – Springer International Publishing, 2020. – Pp. 491-507.
25. Foret P. et al. Sharpness-aware minimization for efficiently improving generalization // arXiv preprint arXiv:2010.01412. – 2020.
26. Pham H. et al. Meta pseudo labels // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. – 2021. – Pp. 11557-11568.
27. Yu J. et al. CoCa: Contrastive captioners are image-text foundation models // arXiv preprint arXiv:2205.01917. – 2022.
28. Chen X. et al. Symbolic discovery of optimization algorithms // arXiv preprint arXiv:2302.06675. – 2023.
29. Zhang H. et al. DINO: DETR with improved denoising anchor boxes for end-to-end object detection // arXiv preprint arXiv:2203.03605. – 2022.
30. Yang J. et al. Focal modulation networks // Advances in Neural Information Processing Systems. – 2022. – Vol. 35. – Pp. 4203-4217.
31. Wang L. et al. Sample-efficient neural architecture search by learning actions for Monte Carlo tree search // IEEE Transactions on Pattern Analysis and Machine Intelligence. – 2021. – Vol. 44. – No. 9. – Pp. 5503-5515.
32. Wang W. et al. InternImage: Exploring large-scale vision foundation models with deformable convolutions // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. – 2023. – Pp. 14408-14419.
33. Zong Z., Song G., Liu Y. DETRs with collaborative hybrid assignments training // Proceedings of the IEEE/CVF International Conference on Computer Vision. – 2023. – Pp. 6748-6758.