Adaptive Regression Model Construction Based on the Functional Quality Analysis of the Sequence Segment Processing

I. S. Lebedev

doi:10.15622/ia.24.2.1

Authors: Lebedev I.S.¹
Affiliations:
1. St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)
Issue: Vol 24, No 2 (2025)
Pages: 363-394
Section: Mathematical modeling and applied mathematics
URL: https://journal-vniispk.ru/2713-3192/article/view/289691
DOI: https://doi.org/10.15622/ia.24.2.1
ID: 289691

Abstract

The article addresses the construction of an adaptive model that improves the quality indicators of information sequence processing. In data processing techniques used across many application areas, analysis of the observed objects is computationally expensive and requires many iterations when data properties change. The article proposes a technique for selecting among segments of an information sequence obtained by different segmentation methods; its distinguishing feature is the use of a quality functional of the regression models that process the subsequences. Sequences of observed objects arriving at the model input are divided by several specified segmentation algorithms. Pre-selected regression models are trained on each resulting segment, and the model with the best value of the quality functional is assigned to each segment. This yields an aggregate data processing model. The technique is evaluated experimentally on model data and data samples. MSE and MAE values are obtained for different processing algorithms and different numbers of segments. The proposed method improves (that is, reduces) MSE and MAE by segmenting the data and assigning to each segment the regression model that performs best on it. The solution is aimed at further improving ensemble methods. It makes tuning of the base algorithms more efficient when data properties change and improves the interpretability of results. The method can be used in developing models and methods for processing information sequences.
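The procedure outlined in the abstract (segment the sequence in several ways, train candidate regression models on each segment, keep the model with the best quality functional per segment) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the two candidate models (a constant and a least-squares line), the equal-width segmentation, the hold-out split on odd positions, and all function names are hypothetical.

```python
from statistics import mean

def fit_constant(xs, ys):
    """Candidate model 1 (assumed): predict the training mean everywhere."""
    c = mean(ys)
    return lambda x: c

def fit_linear(xs, ys):
    """Candidate model 2 (assumed): ordinary least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b

CANDIDATES = {"constant": fit_constant, "linear": fit_linear}

def mse(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def mae(model, xs, ys):
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))

def equal_segments(n, k):
    """Hypothetical segmentation algorithm: k contiguous, near-equal index ranges."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return list(zip(bounds[:-1], bounds[1:]))

def best_model_for_segment(xs, ys, quality):
    """Train each candidate on even positions and score it on odd positions."""
    train_x, train_y = xs[0::2], ys[0::2]
    valid_x, valid_y = xs[1::2], ys[1::2]
    scored = []
    for name, fit in CANDIDATES.items():
        model = fit(train_x, train_y)
        scored.append((quality(model, valid_x, valid_y), name, model))
    return min(scored, key=lambda t: t[0])  # ties keep the first candidate

def build_aggregate(xs, ys, segmentations, quality=mse):
    """Return (overall_score, segmentation_key, [(lo, hi, model_name, model), ...])."""
    best = None
    for seg_key, segments in segmentations.items():
        assigned, weighted = [], 0.0
        for lo, hi in segments:
            score, name, model = best_model_for_segment(xs[lo:hi], ys[lo:hi], quality)
            assigned.append((lo, hi, name, model))
            weighted += score * (hi - lo)  # weight each segment by its length
        overall = weighted / len(xs)
        if best is None or overall < best[0]:  # ties keep the earlier segmentation
            best = (overall, seg_key, assigned)
    return best

# Toy sequence: a linear regime followed by a constant regime.
xs = [float(i) for i in range(40)]
ys = [2.0 * x if x < 20 else 5.0 for x in xs]
segmentations = {k: equal_segments(len(xs), k) for k in (1, 2, 4)}
score, chosen, assigned = build_aggregate(xs, ys, segmentations)
print(chosen, [(lo, hi, name) for lo, hi, name, _ in assigned])
```

The sketch needs at least four points per segment so that the linear candidate has two distinct training inputs. Passing `quality=mae` switches the selection criterion from MSE to MAE without changing the rest of the procedure.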

Keywords

machine learning, adaptive models, improving the quality of processing, regression models

About the authors

I. S. Lebedev

St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)

Email: isl_box@mail.ru
14th Line V.O., 39

