A Thematic Coherence Study of a Bilingual Corpus of Articles on Oil and Gas Research


Abstract

Structural differences between scientific articles that arise from their translation from Russian into English are studied using the modal topic modeling technique. Each document in the collection is represented by two modalities, English and Russian. Topic modeling yields the bimodal matrices Φ and Θ. Analysis of the Φ matrix shows that the topics differ in the degree of conformity between Russian and English terms when the words are considered in descending order of probability. For 90% of the topics, the English words fully match the Russian ones. Analysis of the Θ matrix shows that for 99% of the documents there is a topic with a value greater than 0.95. Thus, most of the documents are monothematic, regardless of the document language.
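
The two checks described in the abstract can be illustrated with a minimal sketch. It assumes that Θ is stored as a topics-by-documents array and that each modality has its own words-by-topics Φ array. The function names, the toy matrices, and the Russian-to-English dictionary `ru_to_en` are hypothetical and serve only as an illustration; they are not the authors' data or code.

```python
import numpy as np


def monotopic_share(theta, threshold=0.95):
    """Fraction of documents whose largest topic weight exceeds threshold.

    theta has shape (n_topics, n_documents); each column sums to 1.
    """
    dominant = theta.max(axis=0)
    return float((dominant > threshold).mean())


def top_words(phi, vocab, k=2):
    """For each topic (column of phi), list the k most probable words."""
    order = np.argsort(-phi, axis=0)[:k]  # shape (k, n_topics)
    return [[vocab[i] for i in order[:, t]] for t in range(phi.shape[1])]


def topics_match(top_en, top_ru, ru_to_en):
    """True for each topic whose translated Russian top words equal the English ones."""
    return [
        [ru_to_en.get(w) for w in ru] == en
        for en, ru in zip(top_en, top_ru)
    ]


# Toy Theta (assumption): 2 topics, 4 documents.
theta = np.array([
    [0.97, 0.02, 0.50, 0.01],
    [0.03, 0.98, 0.50, 0.99],
])
share = monotopic_share(theta)

# Toy Phi matrices (assumption): 4 words, 2 topics, one matrix per modality.
vocab_en = ["well", "reservoir", "pipeline", "corrosion"]
vocab_ru = ["скважина", "пласт", "трубопровод", "коррозия"]
phi_en = np.array([[0.50, 0.05], [0.40, 0.05], [0.05, 0.50], [0.05, 0.40]])
phi_ru = np.array([[0.60, 0.05], [0.30, 0.05], [0.05, 0.60], [0.05, 0.30]])

# Hypothetical translation dictionary used only for this comparison.
ru_to_en = dict(zip(vocab_ru, vocab_en))

matches = topics_match(top_words(phi_en, vocab_en), top_words(phi_ru, vocab_ru), ru_to_en)
print(share, matches)
```

In this sketch a topic counts as matching only if the translated Russian top words coincide with the English ones in the same order. A looser, order-insensitive comparison would be another reasonable reading of "fully match".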

About the authors

F. Krasnov

Gazpromneft Research and Development Center

Principal contact for editorial correspondence.
Email: Krasnov.FV@gazpromneft-ntc.ru
Russian Federation, St. Petersburg, 190000

M. Shvartsman

National Electronic Information Consortium; Russian State Library

Principal contact for editorial correspondence.
Email: shvar@neicon.ru
Russian Federation, Moscow, 115114; Moscow, 119019

A. Dimentov

National Electronic Information Consortium

Principal contact for editorial correspondence.
Email: adimentov@neicon.ru
Russian Federation, Moscow, 115114

A. Sen

St. Petersburg State University

Principal contact for editorial correspondence.
Email: anastasiia.sen@apmath.spbgu.ru
Russian Federation, St. Petersburg, 199034

Copyright © Allerton Press, Inc., 2019