Investigation of features for extraction of named entities from texts in Russian

V. A. Mozharova; N. V. Lukashevich

doi:10.3103/S0005105517030049

Authors: Mozharova V.A.¹, Lukashevich N.V.²
Affiliations:
1. Department of Computational Mathematics and Cybernetics
2. Scientific Research Computational Center
Issue: Vol. 51, No. 3 (2017)
Pages: 127-134
Section: Text Processing Automation
URL: https://journal-vniispk.ru/0005-1055/article/view/150171
DOI: https://doi.org/10.3103/S0005105517030049
ID: 150171

Abstract

This paper examines features used in machine-learning approaches to extracting named entities from Russian texts: features of the token itself (lexeme), as well as vocabulary, contextual, cluster, and two-stage features. The contribution of each feature to the quality of named entity extraction is studied. A conditional random field (CRF) classifier serves as the machine learning method in the experiments described in the paper. Feature contributions are compared on two open collections using the F-measure.
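
As an illustration only, the feature groups named in the abstract can be sketched as a per-token feature dictionary of the kind CRF toolkits commonly accept, together with an F-measure over entity spans. Everything concrete below is an assumption and is not taken from the paper: the names GAZETTEER, CLUSTERS, token_features, and f_measure, the toy dictionary entries and cluster IDs, the one-token context window, exact-span matching, and the balanced (F1) form of the F-measure. Two-stage features are left out because the abstract does not define them.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical resources for illustration; the paper's actual
# vocabularies and word clusters are not described in the abstract.
GAZETTEER: Set[str] = {"москва", "россия"}
CLUSTERS: Dict[str, str] = {"москва": "c17", "петербург": "c17"}

Span = Tuple[int, int, str]  # (start token, end token, entity type)


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for tokens[i]."""
    tok = tokens[i]
    low = tok.lower()
    feats: Dict[str, object] = {
        # Features of the token itself.
        "lower": low,
        "is_capitalized": tok[:1].isupper(),
        "suffix3": low[-3:],
        # Vocabulary feature: membership in a (toy) name dictionary.
        "in_gazetteer": low in GAZETTEER,
        # Cluster feature: ID of the (toy) word cluster, if any.
        "cluster": CLUSTERS.get(low, "none"),
    }
    # Contextual features: neighbouring tokens in an assumed +/-1 window.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def f_measure(gold: Set[Span], predicted: Set[Span]) -> float:
    """Balanced F-measure (F1) with exact span-and-type matching."""
    tp = len(gold & predicted)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

A feature's contribution could then be estimated by training once with it and once without it, and comparing the two f_measure values on the same collection.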

Keywords

named entity, information extraction, machine learning

About the authors

V. Mozharova

Department of Computational Mathematics and Cybernetics

Corresponding author.
Email: valerie.mozharova@gmail.com
Russia, Moscow, 119991

N. Lukashevich

Scientific Research Computational Center

Email: valerie.mozharova@gmail.com
Russia, Moscow, 119991
