Успехи математических наук
Peer-reviewed scientific journal
Editor-in-Chief
- Valery Vasilievich Kozlov, Academician of the Russian Academy of Sciences, Doctor of Physical and Mathematical Sciences, Professor
Publisher
- Steklov Mathematical Institute of the Russian Academy of Sciences
Founders
- MIAN (Steklov Mathematical Institute of the Russian Academy of Sciences)
- RAS (Russian Academy of Sciences)
About the journal
Frequency
The journal is published 6 times per year.
Indexing
- Russian Science Citation Index (RSCI), hosted by the Russian Scientific Electronic Library (elibrary.ru)
- Math-Net.Ru
- MathSciNet
- zbMATH
- Google Scholar
- Ulrich's Periodicals Directory
- WorldCat
- Scopus
- Web of Science
- CrossRef
Registration certificate ПИ № ФС 77-69578, issued on May 2, 2017.
Aims and scope
The journal "Успехи математических наук" publishes survey articles on the most topical areas of mathematics, brief communications of the Moscow Mathematical Society, and information on mathematical life in Russia and abroad. It is intended for researchers, university teachers, postgraduate students, and senior undergraduate students.
Main journal website: https://www.mathnet.ru/rm
Translated version
The archive of the English version is available at: https://www.mathnet.ru/eng/umn.
Current issue



Volume 79, No. 6 (2024)
Preface by the Editor-in-Chief



Accelerated Stochastic ExtraGradient: Mixing Hessian and gradient similarity to reduce communication in distributed and federated learning
Abstract
Modern trends in machine learning demand ever greater generalization ability from models, which leads to growth in both model size and training sample size. Such tasks are already difficult to solve on a single device, which is why distributed and federated learning approaches are becoming increasingly popular. Distributed computing involves communication between devices, which raises two key problems: efficiency and privacy. One of the best-known approaches to reducing communication costs is to exploit the similarity of local data. Both Hessian similarity and gradient homogeneity have been studied in the literature, but separately. In this paper we combine these two assumptions in the analysis of a new method that incorporates the ideas of data similarity and client sampling. Moreover, to address privacy concerns, we apply the technique of additive noise and analyze its impact on the convergence of the proposed method. The theory is confirmed by training on real datasets. Bibliography: 45 titles.
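To make the ingredients of the abstract concrete, here is a minimal sketch of one communication round that combines an extragradient step, client sampling, and additive Gaussian noise for privacy. This is a sketch under stated assumptions, not the authors' actual method: the gradient oracles, the step size gamma, the sample size, and the noise level noise_std are all illustrative.

import numpy as np

def extragradient_round(x, client_grads, gamma, sample_size, noise_std, rng):
    # Client sampling: only a random subset of devices participates this round.
    chosen = rng.choice(len(client_grads), size=sample_size, replace=False)

    def noisy_avg_grad(point):
        # Average the sampled local gradients; add Gaussian noise for privacy.
        g = np.mean([client_grads[i](point) for i in chosen], axis=0)
        return g + rng.normal(0.0, noise_std, size=point.shape)

    x_extra = x - gamma * noisy_avg_grad(x)      # extrapolation step
    return x - gamma * noisy_avg_grad(x_extra)   # update at the extrapolated point

# Toy usage: two clients with similar quadratic losses (local data similarity).
rng = np.random.default_rng(0)
clients = [lambda z, a=a: a * z for a in (1.0, 1.1)]
x = np.ones(3)
for _ in range(100):
    x = extragradient_round(x, clients, gamma=0.3, sample_size=2,
                            noise_std=0.01, rng=rng)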



On greedy approximation in complex Banach spaces
Abstract
The general theory of greedy approximation with respect to arbitrary dictionaries is well developed in the case of real Banach spaces. Recently, some results proved for the Weak Chebyshev Greedy Algorithm (WCGA) in the case of real Banach spaces were extended to complex Banach spaces. In this paper we extend to complex Banach spaces some of the results known in the real case for greedy algorithms other than the WCGA. Bibliography: 25 titles.
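For orientation, here is the standard WCGA iteration with respect to a dictionary $\mathcal{D}$, recalled from the literature as a reference point rather than as a result of the paper; $t \in (0,1]$ is the weakness parameter, and in the complex case $|\cdot|$ is the modulus:
$$f_0 := f; \qquad \text{for } m \ge 1: \quad \varphi_m \in \mathcal{D} \ \text{ with } \ |F_{f_{m-1}}(\varphi_m)| \ge t \sup_{g \in \mathcal{D}} |F_{f_{m-1}}(g)|,$$
$$\Phi_m := \operatorname{span}\{\varphi_1, \dots, \varphi_m\}, \qquad f_m := f - G_m,$$
where $F_{f_{m-1}}$ is a norming functional of $f_{m-1}$ (that is, $\|F_{f_{m-1}}\| = 1$ and $F_{f_{m-1}}(f_{m-1}) = \|f_{m-1}\|$) and $G_m$ is a best approximant (Chebyshev projection) of $f$ from $\Phi_m$.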



Extrapolation of the Bayesian classifier when the support of the distribution of a mixture of two classes is unknown
Abstract



Local SGD for near-quadratic problems: Improving convergence under unconstrained noise conditions
Abstract
Distributed optimization plays an important role in modern large-scale machine learning and data processing systems by optimizing the utilization of computational resources. One of the classical and popular approaches is Local Stochastic Gradient Descent (Local SGD), characterized by multiple local updates before averaging, which is particularly useful in distributed environments for reducing communication bottlenecks and improving scalability. A typical feature of this method is its dependence on the communication frequency. However, for a quadratic target function with data distributed homogeneously across all devices, the influence of the communication frequency vanishes. As a natural consequence, subsequent studies assume a Lipschitz Hessian, since this indicates that the optimized function is, to a certain extent, close to a quadratic one. However, to extend the completeness of the Local SGD theory and unlock its potential, in this paper we abandon the Lipschitz Hessian assumption and introduce a new concept of approximate quadraticity. This assumption gives a new perspective on problems with near-quadratic properties. In addition, existing theoretical analyses of Local SGD often assume bounded variance. We, in turn, consider an unbounded noise condition, which allows us to broaden the class of problems under study. Bibliography: 36 titles.
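For concreteness, a minimal sketch of vanilla Local SGD as described in the abstract: each device performs several local stochastic gradient steps, and communication happens only when the iterates are averaged. The oracles and hyperparameters below are illustrative assumptions, not the setting analyzed in the paper.

import numpy as np

def local_sgd(x0, client_grads, gamma, local_steps, rounds, rng):
    xs = [x0.copy() for _ in client_grads]         # one model copy per device
    for _ in range(rounds):
        for dev, grad in enumerate(client_grads):
            for _ in range(local_steps):           # local phase: no communication
                xs[dev] = xs[dev] - gamma * grad(xs[dev], rng)
        avg = np.mean(xs, axis=0)                  # communication: model averaging
        xs = [avg.copy() for _ in client_grads]
    return xs[0]

# Toy usage: quadratic losses with unbounded (Gaussian) gradient noise.
rng = np.random.default_rng(1)
clients = [lambda z, r, a=a: a * z + 0.1 * r.normal(size=z.shape)
           for a in (1.0, 1.2)]
x = local_sgd(np.ones(3), clients, gamma=0.05, local_steps=10, rounds=50, rng=rng)

The communication frequency is controlled by local_steps; the abstract's starting observation is that for a quadratic target with homogeneous data its influence vanishes.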



Local methods with adaptivity via scaling
Abstract
The rapid development of machine learning and deep learning has introduced increasingly complex optimization challenges. Indeed, training modern advanced models has become difficult to implement without leveraging multiple computing nodes in a distributed environment. Distributed optimization is also fundamental to emerging fields such as federated learning. Specifically, there is a need to organize the training process so as to minimize the time lost to communication. A widely used and extensively studied technique for mitigating the communication bottleneck is to perform local training before communication. This approach is the focus of our paper. Concurrently, adaptive methods that incorporate scaling, most notably Adam, have gained significant popularity in recent years. This paper therefore aims to merge the local training technique with the adaptive approach to develop efficient distributed learning methods. We consider the classical Local SGD method and enhance it with a scaling feature. A crucial aspect is that the scaling is described generically, which allows us to analyze various approaches, including Adam, RMSProp, and OASIS, in a unified manner. In addition to the theoretical analysis, we validate the performance of our methods in practice by training a neural network. Bibliography: 49 titles.
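A hedged sketch of the kind of generic scaling the abstract refers to: each device preconditions its local step with a diagonal matrix built from past gradients. The RMSProp-style rule below is one instance of such a scheme; the function name and constants are illustrative, not the paper's notation.

import numpy as np

def scaled_local_step(x, v, g, gamma, beta=0.999, eps=1e-8):
    # RMSProp-style running estimate of squared gradients (the scaling state).
    v = beta * v + (1.0 - beta) * g * g
    # Preconditioned (scaled) gradient step.
    x = x - gamma * g / (np.sqrt(v) + eps)
    return x, v

In a distributed run each device would keep its own pair (x, v) during the local phase, with the models averaged at communication; other scalings (Adam-style moments, OASIS-style Hessian estimates) plug in by changing how v, and possibly a momentum term, is updated.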



BRIEF COMMUNICATIONS
On spectral problems of Kolmogorov and Rokhlin in the class of mixing automorphisms



Subsystems of orthogonal systems and recovery of sparse signals in the presence of random losses



Mixing in random dynamical systems with stationary noise



Distribution of zeros of polynomials of multiple discrete orthogonality in the Angelesco case



On the description of periodic elements of elliptic fields defined by a polynomial of degree three



On the variety of inflection points of plane cubics



MATHEMATICAL LIFE
On the 90th anniversary of the birth of Vladimir Nikolaevich Sudakov (1934–2016)



On the 90th birthday of Nina Nikolaevna Uraltseva


