Formation of synthetic data in machine learning models based on multiscale analysis of binary Markov models
- Authors: Pushkin P.Y., Konyshev M.Y., Perevezentsev D.S., Grachev A.S.
- Affiliation: MIREA – Russian Technological University
- Issue: Volume 12, No. 5 (2025)
- Pages: 47-55
- Section: SYSTEM ANALYSIS, INFORMATION MANAGEMENT AND PROCESSING, STATISTICS
- URL: https://journal-vniispk.ru/2313-223X/article/view/358384
- DOI: https://doi.org/10.33693/2313-223X-2025-12-5-47-55
- EDN: https://elibrary.ru/EGDNUN
- ID: 358384
Abstract
A method is presented for generating synthetic data to train systems that work with binary Markov data sources. The method is based on estimates of the elements of the transition probability matrices of binary Markov chains, obtained through multiscale analysis. It differs from known methods by taking into account the ranges of values that the matrix elements take in the observed objects. An algorithm for forming synthetic data is proposed that computes the elements of the transition probability matrices within the estimates obtained from real data. A computational experiment, organized to test the quality of machine learning with the developed method and algorithm, confirmed that the quality of artificial intelligence systems can be improved.
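The abstract gives no formulas or pseudocode, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a first-order binary Markov chain described by P(1|0) and P(1|1), and it interprets "multiscale analysis" as estimating those probabilities over non-overlapping windows of several lengths. The function names, the window sizes, and the uniform sampling inside the estimated ranges are all hypothetical choices made for this example.

```python
import random


def transition_estimates(bits):
    """Estimate P(1|0) and P(1|1) of a first-order binary Markov chain."""
    counts = {0: [0, 0], 1: [0, 0]}  # counts[prev][cur]
    for prev, cur in zip(bits, bits[1:]):
        counts[prev][cur] += 1

    def ratio(row):
        total = row[0] + row[1]
        return row[1] / total if total else None  # None: state never left

    return ratio(counts[0]), ratio(counts[1])


def estimate_ranges(bits, window_sizes=(64, 128, 256)):
    """Assumed multiscale step: min/max of the estimates over windows of several sizes."""
    p01s, p11s = [], []
    for w in window_sizes:
        for start in range(0, len(bits) - w + 1, w):
            p01, p11 = transition_estimates(bits[start:start + w])
            if p01 is not None and p11 is not None:
                p01s.append(p01)
                p11s.append(p11)
    if not p01s:
        raise ValueError("sequence too short for the chosen window sizes")
    return (min(p01s), max(p01s)), (min(p11s), max(p11s))


def generate_synthetic(ranges, length, rng):
    """Draw matrix elements inside the estimated ranges, then simulate the chain."""
    (lo01, hi01), (lo11, hi11) = ranges
    p01 = rng.uniform(lo01, hi01)  # assumption: uniform choice within the range
    p11 = rng.uniform(lo11, hi11)
    p_one = {0: p01, 1: p11}
    state = rng.randint(0, 1)
    out = [state]
    for _ in range(length - 1):
        state = 1 if rng.random() < p_one[state] else 0
        out.append(state)
    return out, (p01, p11)
```

A possible use: call `estimate_ranges` on an observed bit sequence, then call `generate_synthetic` repeatedly with a seeded `random.Random` to obtain many training sequences whose transition probabilities stay inside the observed ranges.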
About the authors
Pavel Pushkin
MIREA – Russian Technological University
Corresponding author
Email: pushkin@mirea.ru
SPIN code: 9901-4887
Cand. Sci. (Eng.), Associate Professor, Director, Institute of Advanced Technologies and Industrial Programming
Russia, Moscow

Mikhail Konyshev
MIREA – Russian Technological University
Email: konyshev@mirea.ru
SPIN code: 4213-7083
Dr. Sci. (Eng.), Associate Professor, Professor, Department KB-1 “Information Protection”, Institute of Cybersecurity and Digital Technologies
Russia, Moscow

Dmitry Perevezentsev
MIREA – Russian Technological University
Email: perevezentsev@mirea.ru
Senior Lecturer, Basic Department BK-252, Institute of Artificial Intelligence
Russia, Moscow

Alexander Grachev
MIREA – Russian Technological University
Email: grachyov@mirea.ru
SPIN code: 2556-2201
Senior Lecturer, Department KB-1, Institute of Cybersecurity and Digital Technologies
Russia, Moscow