Vol 21, No 4 (2022)

Artificial intelligence, knowledge and data engineering

Randomized Machine Learning and Forecasting of Nonlinear Dynamic Models Applied to SIR Epidemiological Model

Popkov A., Dubnov Y., Popkov Y.

Abstract

We propose an approach to estimating the parameters of nonlinear dynamic models using the concept of Randomized Machine Learning (RML). The approach is based on the transition from deterministic models to random ones (with random parameters), followed by estimation of the probability distributions of the parameters and noises from real data. The main feature of the method is its efficiency when only a small amount of real data is available. The paper considers models formulated as ordinary differential equations, which are converted to discrete form in order to pose and solve an entropy optimization problem. The application of the proposed approach is demonstrated on the problem of predicting the total number of COVID-19 infections using a dynamic SIR epidemiological model. To do this, we construct a randomized SIR model (R-SIR) with one parameter, whose entropy-optimal estimate is given by its probability density function, together with the probability density functions of the measurement noise at the points where training is performed. Next, the technique of randomized prediction with noise filtering is applied: the corresponding distributions are sampled, an ensemble of predictive trajectories is constructed, and the trajectory averaged over the ensemble is calculated. The paper reports a computational experiment on real operational data on infection cases, in the form of a comparative study against a well-known parameter estimation method based on least squares. The experimental results show a significant decrease in the mean absolute percentage error (MAPE) with respect to real observations over the forecast interval, which demonstrates the effectiveness of the proposed method for problems of the type considered.
Informatics and Automation. 2022;21(4):659-677
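
The ensemble-averaged forecast and the MAPE score mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' R-SIR procedure: the uniform sampling of the infection rate stands in for the entropy-optimal density, the recovery rate is held fixed, noise filtering is omitted, and the reference trajectory is synthetic. All names and numeric values are assumptions made for illustration.

```python
import random

def sir_step(s, i, r, beta, gamma, dt=1.0):
    """One explicit Euler step of the SIR equations (population fractions)."""
    new_infections = beta * s * i * dt
    new_recoveries = gamma * i * dt
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries

def cumulative_cases(beta, gamma, i0, steps):
    """Cumulative infected fraction (I + R) after each discrete step."""
    s, i, r = 1.0 - i0, i0, 0.0
    trajectory = []
    for _ in range(steps):
        s, i, r = sir_step(s, i, r, beta, gamma)
        trajectory.append(i + r)
    return trajectory

def ensemble_mean_forecast(beta_samples, gamma, i0, steps):
    """Average the trajectories generated by each sampled parameter value."""
    trajectories = [cumulative_cases(b, gamma, i0, steps) for b in beta_samples]
    return [sum(column) / len(trajectories) for column in zip(*trajectories)]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Assumption: a uniform density replaces the entropy-optimal one.
rng = random.Random(0)
beta_samples = [rng.uniform(0.25, 0.35) for _ in range(200)]
forecast = ensemble_mean_forecast(beta_samples, gamma=0.1, i0=0.001, steps=30)

# Synthetic "observations" generated with a fixed parameter, for demonstration only.
reference = cumulative_cases(0.3, 0.1, 0.001, 30)
print(f"MAPE against the synthetic reference: {mape(reference, forecast):.2f}%")
```

In practice the samples would be drawn from the estimated parameter density and the score computed against real observations in the forecast interval.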

Analytical Review of Methods for Solving Data Scarcity Issues Regarding Elaboration of Automatic Speech Recognition Systems for Low-Resource Languages

Kipyatkova I., Kagirov I.

Abstract

This paper discusses the principal methods for addressing the shortage of training data when developing automatic speech recognition systems for so-called low-resource languages. The notion of a low-resource language is examined, and a working definition is formulated on the basis of a number of papers on the topic. The main difficulties of applying classical approaches to automatic speech recognition to low-resource languages are identified, and the principal methods used to solve these problems are outlined. The paper discusses in detail methods for data augmentation, transfer learning and collection of new language data. Depending on the specific task, methods for audio and text data augmentation, transfer learning and multi-task learning are distinguished. Section 4 of the paper discusses current information support methods, databases and the basic principles of their architecture with regard to low-resource languages. Conclusions are drawn about when augmentation and knowledge transfer methods are justified for languages with low information support. When neither language data nor structurally similar parent models are available, the preferred option is to collect a new database, including by crowdsourcing. Multilingual learning models are effective for small datasets. If large amounts of language data are available, the most efficient method is transfer learning within a language pair. The conclusions of this review will be applied to data for the low-resource Karelian language, for which the authors have been developing an automatic speech recognition system since the beginning of 2022.
Informatics and Automation. 2022;21(4):678-709

Apple Leaf Disease Classification Using Image Dataset: a Multilayer Convolutional Neural Network Approach

Mahamudul Hashan A., Md Rakib Ul Islam R., Avinash K.

Abstract

Agriculture is one of the prime sources of economic growth in Russia; global apple production in 2019 was 87 million tons. Apple leaf diseases are the main reason for annual decreases in apple production, which cause huge economic losses. Automated methods for detecting apple leaf diseases reduce the laborious work of monitoring apple orchards and help detect disease symptoms early. This article proposes a multilayer convolutional neural network (MCNN) that classifies apple leaves into one of the following disease categories: apple scab, black rot, and apple cedar rust, using a newly created dataset. In this method, affine and perspective transformations were used to increase the size of the dataset. After that, preprocessing operations based on OpenCV cropping and histogram equalization were applied to improve the proposed image dataset. The experimental results show that the system achieves 98.40% training accuracy and 98.47% validation accuracy on the proposed image dataset with a smaller number of training parameters. The results indicate that the proposed MCNN model achieves higher classification accuracy than other well-known state-of-the-art approaches. The proposed model can also be used to detect and classify other types of apple diseases from different image datasets.

Informatics and Automation. 2022;21(4):710-728

Robotics, automation and control systems

Analytical Review of Approaches to the Distribution of Tasks for Mobile Robot Teams Based on Soft Computing Technologies

Darintsev O., Migranov A.

Abstract

The paper considers the use of various heuristic algorithms based on soft computing technologies (genetic algorithms, ant algorithms and artificial neural networks) for distributing tasks among groups of mobile robots performing simple, single-step operations in a shared workspace. It is shown that this problem is NP-hard and that solving it by exhaustive search is infeasible for a large number of tasks. The initial problem is reduced to typical NP-complete problems: the generalized problem of finding the optimal group of closed routes from one depot and the traveling salesman problem. Each of the selected algorithms is described and their characteristics are compared. A step-by-step algorithm is given that takes into account the selected genetic operators and their parameters for a given population size. The general structure of the developed algorithm is presented; it solves a multi-criteria optimization problem efficiently enough, taking into account time costs and an integral criterion of robot efficiency that covers energy costs, the functional saturation of each agent in the group, and so on. The possibility of solving the initial problem using an ant algorithm and a generalized search for the optimal group of closed routes is shown. For multi-criteria optimization, it is shown that the resulting vector optimality criterion can be reduced by linear convolution through the introduction of additional parameters characterizing group control: the overall efficiency of all robots, the energy costs of operating the support group, and the energy required to place one robot on the work field. To solve the task distribution problem with a Hopfield neural network, the problem is represented as a graph obtained in the transition from the generalized problem of finding the optimal group of closed routes from one depot to the traveling salesman problem. The quality indicator is the total path traveled by each of the robots in the group.
Informatics and Automation. 2022;21(4):729-757
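
The linear convolution of a vector criterion mentioned in the abstract above amounts to a weighted sum of the individual criteria. The Python sketch below shows the idea on hypothetical data; the plan names, criterion values and weights are illustrative assumptions, not figures from the paper, and all criteria are assumed to be minimized.

```python
def linear_convolution(criteria, weights):
    """Collapse a vector of criteria into one scalar by a weighted sum."""
    if len(criteria) != len(weights):
        raise ValueError("criteria and weights must have the same length")
    return sum(w * c for w, c in zip(weights, criteria))

# Hypothetical task-distribution plans: (total time, energy cost), arbitrary units.
candidates = {
    "plan_a": (120.0, 40.0),
    "plan_b": (100.0, 65.0),
    "plan_c": (150.0, 30.0),
}
weights = (0.5, 0.5)  # assumed equal importance of both criteria

# Pick the plan with the smallest scalarized cost.
best = min(candidates, key=lambda name: linear_convolution(candidates[name], weights))
print(best)
```

Criteria measured in different units are normally normalized before weighting; that step is omitted here for brevity.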

Improving the Accuracy of IP Geolocation Based on Public IP Geoservices Data

Ivanov M., Polunin A.

Abstract

IP geolocation is the process of determining the real geographic location of an electronic device connected to the Internet from its global network address [1]. It is now widely used in Internet commerce, marketing and advertising, information security [2], and other areas of human activity. There are different methods for determining the location of a remote network device, which differ both in the type of information analyzed (packet transmission delay, DNS server resource records, the content of Web pages) and in the result (country or city name, postal address, probable area of location or exact coordinates) [3, 4]. The IP geolocation error depends on the country, population density and type of network device, and ranges from several tens of meters to hundreds of kilometers. For the same input data, the results of different IP geoservices can vary significantly. The object of this study is public IP geoservices that geolocate nodes of the global network by their IP addresses, and specifically their accuracy and completeness. The sample of IP geoservices for testing was formed from the most popular ones [5]. During the study, IP geolocation results were compared with reliable information about the location of a set of IP addresses; country, city and geographic coordinates were used as accuracy indicators. Based on a comparative analysis of the test results, conclusions are drawn about the accuracy of IP geolocation services according to the selected indicators, about their essential properties, and about the dependence of the geolocation error on the size of the settlement. To improve the accuracy of IP geolocation, the authors propose an ensemble method that averages the coordinates obtained from several IP geoservices.
Informatics and Automation. 2022;21(4):758-785
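
One way to average coordinates returned by several geoservices, as the abstract above describes in general terms, is sketched below in Python. The abstract does not specify the averaging scheme, so the spherical mean used here (averaging 3-D unit vectors so that longitudes near the 180th meridian do not cancel out) is an assumption, and the sample responses are hypothetical.

```python
import math

def mean_coordinates(points):
    """Average (latitude, longitude) pairs in degrees via 3-D unit vectors."""
    if not points:
        raise ValueError("at least one coordinate pair is required")
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    # Convert the mean vector back to latitude and longitude.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)

# Hypothetical answers of three geoservices for the same IP address.
responses = [(55.75, 37.62), (55.80, 37.50), (55.70, 37.70)]
lat, lon = mean_coordinates(responses)
print(f"ensemble estimate: {lat:.3f}, {lon:.3f}")
```

For points that lie close together, the result is nearly the same as a plain arithmetic mean of latitudes and longitudes.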

Method for Determining the Functional Dependences of Working Outputs of Logic Combination Schemes for Development Unidirectional Errors

Abdullaev R.

Abstract

Structural dependences of the working outputs of logical combinational circuits are studied with the aim of subsequently identifying the types of possible errors. The types of errors that can manifest themselves and a classification of the working outputs of logical combinational circuits are given. It is shown that internal structural connections in discrete devices increase the multiplicity of possible errors. A condition is given for determining the functional dependence of outputs with respect to errors of the multiplicity under study. It is noted that, among the many types of errors, unidirectional errors can appear at the outputs of the circuits. A well-known method for determining unidirectionally dependent working outputs of discrete device circuits is presented; its drawback is that it can only compare outputs pairwise, each output against all the others in the set. To simplify the search for such outputs, the author proposes a new method for identifying unidirectionally dependent working outputs. Unlike known methods, it is applicable to any number of outputs and therefore requires much less time to find them. It is shown that logical combinational circuits can have functional features such that only unidirectional errors can appear at the working outputs. Therefore, a new method for identifying any number of unidirectionally independent working outputs of combinational circuits is also proposed. It is shown that the proposed methods for finding unidirectionally dependent and unidirectionally independent outputs of logical combinational circuits require only simple mathematical calculations. Internal faults of the circuits under diagnosis were simulated in Multisim, and all possible errors at the working outputs were recorded. The experiments confirmed the validity of the theoretical results.
Informatics and Automation. 2022;21(4):786-811
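
For readers unfamiliar with the term used in the abstract above, an error at the circuit outputs is unidirectional when every distorted bit changes in the same direction (all 0 to 1, or all 1 to 0). The Python sketch below only classifies a single pair of output vectors; it is not the method proposed in the article, and the label names are illustrative.

```python
def classify_error(expected, observed):
    """Classify the distortion between two equal-length bit vectors."""
    if len(expected) != len(observed):
        raise ValueError("vectors must have the same length")
    # Count distortions in each direction.
    zero_to_one = sum(1 for e, o in zip(expected, observed) if e == 0 and o == 1)
    one_to_zero = sum(1 for e, o in zip(expected, observed) if e == 1 and o == 0)
    if zero_to_one == 0 and one_to_zero == 0:
        return "none"
    if zero_to_one + one_to_zero == 1:
        return "single"
    if zero_to_one == 0 or one_to_zero == 0:
        return "unidirectional"
    return "multidirectional"

print(classify_error([0, 0, 1], [1, 1, 1]))  # two bits distorted, both 0 -> 1
print(classify_error([0, 1, 1], [1, 0, 1]))  # one 0 -> 1 and one 1 -> 0
```

In a fault simulation, such a check would be repeated for every injected fault and every input combination to see which error types a given group of outputs can exhibit.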

Method of Structural-Parametric Synthesis of Configuration Multi-Mode Object

Pavlov A., Pavlov D., Umarov A., Gordeev A.

Abstract

Modern objects with a reconfigurable structure are complex: various factors of their interaction with the environment must be taken into account, the number of their constituent elements and subsystems is growing, and the number of internal connections grows rapidly with it. This complexity manifests itself as structural complexity, complexity of functioning, complexity of choice of behavior, complexity of modeling and complexity of development. Such systems operate under significant uncertainty associated with changes in the goals and objectives facing the object and with targeted and/or non-targeted disturbances from the external environment. These aspects of complexity are associated not only with the uncertain effects of the external environment, but also with the many different modes (types) of functioning, which correspond to the multiplicity of tasks being solved and of the quality indicators for their solution. As a rule, systems with a fixed structure, usually tuned to one steady (given) mode, do not provide the best control quality in other modes. Therefore, the multi-mode nature and uncertainty of the operating conditions make it necessary to solve the problem of analysis and synthesis of the configuration and reconfiguration of the objects under consideration on the basis of intelligent approaches. At the stages of creating and designing objects with a tunable structure, interconnected sets of operating modes and structures should be synthesized, and possibly a level of redundancy introduced into these sets, subject to space-time, technical and technological restrictions, such that during operation the object can respond flexibly to all design and off-design contingencies that cause structural changes in it. From a formal point of view, these problems can be solved within the framework of an important class of modern scientific and technical problems: multi-criteria structural-functional synthesis of configurations of multi-mode objects at various stages of their life cycle. This article presents a method for solving these problems, based on the concept of the parametric genome of complex multi-mode objects proposed by the authors. This concept makes it possible to store in concentrated form the explicit and implicit knowledge of experts about the interaction of the elements and subsystems of an object under various combinations of operating modes, and to quickly calculate optimistic and pessimistic estimates of the structural and functional reliability indicators of homogeneous/heterogeneous, monotonic/non-monotonic, equivalent/unequal multi-mode objects. For the multi-criteria choice of the required number of non-dominated configuration variants of a multi-mode object, evenly distributed over the set of effective (Pareto) alternatives, a combination of the method of interval lexicographic ordering (successive concessions) and an operator decision rule is proposed. In addition, to analyze in detail whether an object can implement joint or separate activation of operating modes with equal or unequal intensity of use, a fuzzy-possibilistic representation of the generalized indicator of structural and functional reliability in the form of a trapezoidal number, together with the determination of its center of gravity, is proposed. The results of applying the developed method of structural-parametric synthesis of configurations of a multi-mode object with a tunable structure are presented using the example of the motion control system of the small spacecraft "Aist-2D".
Informatics and Automation. 2022;21(4):812-845
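
The center of gravity of a trapezoidal fuzzy number, mentioned at the end of the abstract above, has a closed form. The Python sketch below uses the standard centroid formula for a trapezoid with corner points a <= b <= c <= d and unit height; how the authors map their optimistic and pessimistic reliability estimates onto these four points is not stated in the abstract, so the sample values are purely hypothetical.

```python
def trapezoid_centroid(a, b, c, d):
    """Abscissa of the center of gravity of a trapezoidal fuzzy number.

    The membership function rises from a to b, equals 1 on [b, c]
    and falls from c to d.
    """
    if not a <= b <= c <= d:
        raise ValueError("corner points must satisfy a <= b <= c <= d")
    denominator = 3.0 * ((c + d) - (a + b))
    if denominator == 0:
        return float(a)  # degenerate case: a crisp number
    return ((c * c + d * d + c * d) - (a * a + b * b + a * b)) / denominator

# Hypothetical reliability indicator: support [0.80, 0.98], core [0.88, 0.95].
print(round(trapezoid_centroid(0.80, 0.88, 0.95, 0.98), 4))
```

The centroid gives a single defuzzified value that lies between the pessimistic and optimistic bounds and shifts toward the side where the membership function carries more area.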
