Information-Theoretic Method for Classification of Texts


Abstract

We consider a method for automatic text classification (i.e., classification without human involvement) based on universal source coding (data compression). We show that, under certain restrictions, the proposed method is consistent: the classification error tends to zero as the text length increases. As a practical example, we consider the classification of scientific texts (research papers, books, etc.). Experiments show the proposed method to be highly efficient.
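The general idea of compression-based classification can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a compressor or a decision rule, so the sketch assumes `zlib` as a stand-in for a universal code and assigns a text to the class whose training corpus lets it be encoded with the fewest additional bytes. The names `compressed_size`, `classify`, and `TRAINING`, as well as the sample corpora, are hypothetical and chosen for illustration only.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def classify(text: str, training: Mapping[str, str]) -> str:
    """Return the label whose corpus gives the smallest extra compressed length.

    For each class, the text is appended to that class's training corpus and
    the growth of the compressed size is measured. A small growth means the
    compressor found the text statistically similar to the corpus.
    """
    sample = text.encode("utf-8")

    def extra_bytes(label: str) -> int:
        corpus = training[label].encode("utf-8")
        return compressed_size(corpus + b" " + sample) - compressed_size(corpus)

    return min(training, key=extra_bytes)


# Hypothetical toy corpora; real use would need far longer training texts.
TRAINING = {
    "physics": (
        "The electron carries negative charge and the proton carries positive "
        "charge. Quantum mechanics describes the energy levels of the hydrogen atom."
    ),
    "biology": (
        "The cell membrane regulates transport of proteins and nutrients. "
        "Enzymes catalyse metabolic reactions inside the mitochondria of the cell."
    ),
}

if __name__ == "__main__":
    query = "Quantum mechanics describes the energy levels of the hydrogen atom."
    print(classify(query, TRAINING))
```

On inputs this short the result is only indicative. The consistency claim in the abstract concerns the limit of increasing text length, and a general-purpose compressor such as `zlib` has a bounded window, so it is at best a rough approximation of a universal code.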

About the authors

B. Ryabko

Institute of Computational Technologies; Novosibirsk State University

Author for correspondence.
Email: boris@ryabko.net
Russian Federation, Novosibirsk; Novosibirsk

A. Gus’kov

Institute of Computational Technologies; Russian National Public Library for Science and Technology

Email: boris@ryabko.net
Russian Federation, Novosibirsk; Novosibirsk

I. Selivanova

Novosibirsk State University; Russian National Public Library for Science and Technology

Email: boris@ryabko.net
Russian Federation, Novosibirsk; Novosibirsk


Copyright © Pleiades Publishing, Inc., 2017