Iterative convex estimation of linear regression models under data stochastic heterogeneity
- Authors: Golovanov O.A. (1), Tyrsin A.N. (2, 1)
- Affiliations:
  - (1) Institute of Economics, Ural Branch of RAS
  - (2) Ural Federal University named after the first President of Russia B. N. Yeltsin
- Issue: Vol 29, No 2 (2025)
- Pages: 294-318
- Section: Mathematical Modeling, Numerical Methods and Software Complexes
- URL: https://journal-vniispk.ru/1991-8615/article/view/349672
- DOI: https://doi.org/10.14498/vsgtu2138
- EDN: https://elibrary.ru/FUVADI
- ID: 349672
Abstract
One of the key challenges in linear regression analysis is ensuring robust parameter estimation under stochastic data heterogeneity, where classical least squares estimates lose their stability. The problem is particularly acute when the error distribution has heavier tails than the normal distribution. One approach to improving robustness is to replace the quadratic loss function with a convex-concave one, but applying this directly yields a multimodal objective function, which significantly complicates the optimization problem.
This study analyzes the properties of variationally-weighted quadratic and absolute approximations of non-convex loss functions. We propose replacing the original non-convex regression problem with iterative application of the weighted least squares and weighted least absolute deviations methods, which effectively implements these variationally-weighted approximations. Each iteration of the weighted least absolute deviations method employs descent algorithms along nodal lines.
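The idea of replacing a non-convex problem with a sequence of weighted least squares problems can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' algorithm: the Cauchy-type loss rho(r) = log(1 + (r/c)^2), the weight rule w = 1 / (1 + (r/c)^2) (proportional to rho'(r)/r), the two-parameter model y = a + b*x, and the names `weighted_line_fit` and `irls_line` are all hypothetical choices made for the example.

```python
import random

def weighted_line_fit(x, y, w):
    """Closed-form weighted least squares fit of y = a + b*x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def irls_line(x, y, c=1.0, n_iter=100, tol=1e-10):
    """Iteratively reweighted least squares for an assumed Cauchy-type loss.

    Each pass solves a convex weighted quadratic problem whose weights
    are computed from the residuals of the previous pass.
    """
    a, b = weighted_line_fit(x, y, [1.0] * len(x))  # start from ordinary LS
    for _ in range(n_iter):
        # Assumed weight rule: large residuals receive small weights.
        w = [1.0 / (1.0 + ((yi - a - b * xi) / c) ** 2) for xi, yi in zip(x, y)]
        a_new, b_new = weighted_line_fit(x, y, w)
        done = max(abs(a_new - a), abs(b_new - b)) < tol
        a, b = a_new, b_new
        if done:
            break
    return a, b

# Synthetic data: true line y = 1 + 2x, with 10% of observations shifted by +20.
random.seed(0)
x = [random.uniform(0.0, 10.0) for _ in range(200)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.1) for xi in x]
for i in range(20):
    y[i] += 20.0

a_ols, b_ols = weighted_line_fit(x, y, [1.0] * len(x))
a_hat, b_hat = irls_line(x, y)
print("OLS :", a_ols, b_ols)
print("IRLS:", a_hat, b_hat)
```

Because the underlying loss is non-convex, the result of such an iteration can depend on the starting point; the sketch starts from the ordinary least squares fit purely for convenience.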
Through Monte Carlo simulations with various loss functions, we demonstrate that the weighted least absolute deviations method outperforms least squares in computational efficiency while maintaining comparable estimation accuracy. When multiple regression assumptions are violated simultaneously, either the weighted least absolute deviations method or the generalized least absolute deviations method (implemented as a generalized descent algorithm) proves preferable for achieving acceptable accuracy. We provide computational complexity estimates and execution time analyses depending on sample size and number of regression parameters.
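A toy-scale Monte Carlo experiment of this general kind might look like the sketch below. It does not reproduce the paper's experiments or its descent algorithms along nodal lines. As a stand-in, the (optionally weighted) least absolute deviations fit is found by exhaustive search over lines through pairs of observations, relying on the known fact that a minimizer of the sum of absolute residuals for a two-parameter line passes through two data points. The contaminated-normal error model (10% of errors with standard deviation 10), the sample size, and all function names are assumptions made for illustration.

```python
import random

def ols_line(x, y):
    """Ordinary least squares fit of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b

def lad_line(x, y, w=None):
    """Weighted LAD fit of y = a + b*x by brute force over point pairs.

    Exhaustive search, O(n^3); a simple substitute for a descent algorithm.
    """
    n = len(x)
    w = [1.0] * n if w is None else w
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # a vertical line cannot be written as y = a + b*x
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = sum(wk * abs(yk - a - b * xk) for wk, xk, yk in zip(w, x, y))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]

def contaminated_noise(rng, eps=0.1, scale=10.0):
    """Assumed error model: N(0, 1) with probability 1 - eps, else N(0, scale^2)."""
    sd = scale if rng.random() < eps else 1.0
    return rng.gauss(0.0, sd)

def monte_carlo(n=25, reps=200, seed=1):
    """Return the slope RMSE of OLS and of LAD over repeated samples."""
    rng = random.Random(seed)
    se_ols = se_lad = 0.0
    for _ in range(reps):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        y = [1.0 + 2.0 * xi + contaminated_noise(rng) for xi in x]
        se_ols += (ols_line(x, y)[1] - 2.0) ** 2
        se_lad += (lad_line(x, y)[1] - 2.0) ** 2
    return (se_ols / reps) ** 0.5, (se_lad / reps) ** 0.5

rmse_ols, rmse_lad = monte_carlo()
print("slope RMSE, OLS:", rmse_ols)
print("slope RMSE, LAD:", rmse_lad)
```

The cubic cost of the exhaustive search is exactly what makes specialized descent algorithms attractive for larger samples or more parameters; the brute-force version is used here only because it is short and easy to check.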
About the authors
Oleg A. Golovanov
Institute of Economics, Ural Branch of RAS
Email: golovanov.oa@uiec.ru
ORCID iD: 0000-0002-9977-6954
SPIN-code: 4130-8355
Scopus Author ID: 58522704600
https://www.mathnet.ru/rus/person206252
Junior Researcher, Center for Economic Security
Russian Federation, 620014, Yekaterinburg, Moskovskaya str., 29
Alexander N. Tyrsin
Ural Federal University named after the first President of Russia B. N. Yeltsin; Institute of Economics, Ural Branch of RAS
Author for correspondence.
Email: at2001@yandex.ru
ORCID iD: 0000-0002-2660-1221
SPIN-code: 1408-1093
Scopus Author ID: 8503427500
ResearcherId: T-5975-2017
https://www.mathnet.ru/rus/person29355
Dr. Tech. Sci., Professor; Leading Researcher; Center for Economic Security; Head of Department; Dept. of Applied Mathematics and Mechanics
Russian Federation, 620002, Yekaterinburg, Mira str., 19; 620014, Yekaterinburg, Moskovskaya str., 29