WaveNet vocoder for prediction of time series with extreme events

N. V. Gromov; Громов Н. В.; T. A. Levanova; Леванова Т. А.

doi:10.17816/gc623433

Authors: Gromov N.V.¹, Levanova T.A.¹
Affiliations:
1. National Research Lobachevsky State University of Nizhny Novgorod
Issue: Volume 18, No. 4 (2023)
Pages: 847-849
Section: Conference proceedings
URL: https://journal-vniispk.ru/2313-1829/article/view/256352
DOI: https://doi.org/10.17816/gc623433
ID: 256352

Abstract

Extreme events are typically defined as rare or unpredictable events that deviate significantly from typical behavior. Objective criteria for identifying them, however, have yet to be established. Rareness may be characterized by certain scales or by spatial and temporal boundaries, while intensity indicates an event's potential to cause a significant change. In neuroscience and medicine, epileptic seizures are among the most prominent examples of extreme events [1].

In speech synthesis, vocoder networks such as WaveNet [2] generate raw audio. The model is a multi-layer convolutional neural network that acts as a causal filter: each output depends only on past samples and never looks into the future. This property makes the vocoder a candidate for time series prediction. An audio time series can be regarded as a dynamical system with unpredictable regime switching. For instance, the transition from one letter to another can produce large deviations in amplitude, similar to extreme events. The network receives the r previous input samples, known as the receptive field, and uses them to predict the next sample. It has a tree-like structure in which the spacing between the inputs combined at each successive layer grows exponentially. This is necessary because the receptive field r is usually large, on the order of one or two thousand samples: with exponentially growing spacing the number of layers grows only logarithmically with r, whereas with constant spacing it would grow linearly. Recurrent neural networks are difficult to optimize for time series prediction, because they tend to predict a sample very similar to the previous one, so the network converges towards the mode. In a convolutional network, by contrast, the input to the model is longer because of the large receptive field. In sound analysis, for instance, this window spans multiple oscillations, and the network does not single out any specific sample.
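
The relation between depth and receptive field described above can be illustrated with a minimal sketch. It assumes a kernel size of 2 and a single stack in which the dilation doubles at every layer (1, 2, 4, ...); these values, and the function names, are illustrative assumptions and are not taken from the authors' model.

```python
def receptive_field(num_layers, kernel_size=2):
    """Receptive field of stacked causal convolutions whose dilation
    doubles per layer (1, 2, 4, ...). Assumed layout, for illustration."""
    dilations = [2 ** i for i in range(num_layers)]
    return 1 + (kernel_size - 1) * sum(dilations)


def layers_needed(r, kernel_size=2):
    """Smallest number of doubling-dilation layers that covers r samples."""
    n = 0
    while receptive_field(n, kernel_size) < r:
        n += 1
    return n


def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation].
    Samples before the start of the series are treated as zero,
    so y[t] never depends on x[t + 1], x[t + 2], ..."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


print(receptive_field(10))   # 1024
print(layers_needed(2048))   # 11
print(causal_dilated_conv([1, 2, 3, 4], [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0]
```

Under these assumptions, about ten or eleven layers cover a receptive field of one to two thousand samples. With kernel size 2 and no dilation, the receptive field would grow by only one sample per layer.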

The study used synthetic data generated by two Hindmarsh–Rose neurons coupled through chemical synapses. The observed variable, the total membrane potential, was chosen for its biological significance. The resulting time series exhibited extreme events over a range of coupling parameter values. A numerical criterion for extreme events was selected on the basis of prior research [3]. The WaveNet vocoder model achieves 91% accuracy and 82% recall when forecasting extreme events of the same width as the prediction. Recall is crucial in forecasting extreme events, because it accounts for the cases in which the model falsely predicted that no extreme event would occur.
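
The role of recall can be made concrete with a small sketch. It assumes that a window counts as containing an extreme event when any sample exceeds a fixed threshold; the threshold value, the window layout, and the function names are hypothetical, since the abstract does not state the criterion taken from [3]. Precision is computed alongside recall for illustration only and is not necessarily the "accuracy" figure quoted above.

```python
def has_extreme_event(window, threshold):
    """True if any sample in the window exceeds the threshold
    (an assumed criterion, not the one used in the study)."""
    return any(v > threshold for v in window)


def precision_recall(true_windows, predicted_windows, threshold):
    """Compare forecast windows with the true continuation, window by window."""
    tp = fp = fn = 0
    for truth, pred in zip(true_windows, predicted_windows):
        actual = has_extreme_event(truth, threshold)
        forecast = has_extreme_event(pred, threshold)
        if actual and forecast:
            tp += 1   # event occurred and was forecast
        elif forecast:
            fp += 1   # false alarm
        elif actual:
            fn += 1   # missed event: lowers recall
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data: three true events, one of which the forecast misses.
truth = [[0, 3], [0, 0], [4, 0], [0, 5]]
pred = [[0, 2.5], [0, 0], [1, 0], [2.2, 0]]
precision, recall = precision_recall(truth, pred, threshold=2)
print(precision, recall)
```

In this toy example the forecast raises no false alarms but misses one of three events, so only recall drops below 1.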

Keywords

extreme events, convolutional neural networks, machine learning, Hindmarsh–Rose neuron, WaveNet, speech technologies

Full text

ADDITIONAL INFORMATION

Authors’ contribution. All authors made a substantial contribution to the conception of the work, acquisition, analysis, interpretation of data for the work, drafting and revising the work, final approval of the version to be published and agree to be accountable for all aspects of the work.

Funding sources. This study was supported by the Russian Science Foundation grant No. 19-72-10128.

Competing interests. The authors declare that they have no competing interests.

About the authors

N. Gromov

National Research Lobachevsky State University of Nizhny Novgorod

Corresponding author
Email: gromov@itmm.unn.ru
Russia, Nizhny Novgorod

T. Levanova

National Research Lobachevsky State University of Nizhny Novgorod

Email: gromov@itmm.unn.ru
Russia, Nizhny Novgorod

References

1. Engel J Jr, Pedley TA. Generalized convulsive seizures. In: Tassinari CA, Michelucci R, Shigematsu H, et al, editors. Epilepsy: a comprehensive textbook. 1997.
2. Van den Oord A, Dieleman S, Zen H, et al. WaveNet: a generative model for raw audio. arXiv:1609.03499. 2016. doi: 10.48550/arXiv.1609.03499
3. Gromov N, Gubina E, Levanova T. Loss functions in the prediction of extreme events and chaotic dynamics using machine learning approach. In: Proceedings of the Fourth International Conference Neurotechnologies and Neurointerfaces (CNN); 2022 Sept 14–16; Kaliningrad, Russian Federation. Kaliningrad; 2022. P. 46–50. doi: 10.1109/CNN56452.2022.9912515
