Some Features of Literary Texts when Comparing them to Determine their Authorship
- Authors: Akhobadze G.N., Rusyaeva E.Y.
- Affiliations:
  - V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences
- Issue: No 2 (2024)
- Pages: 74-85
- Section: Intelligent systems and technologies
- URL: https://journal-vniispk.ru/2071-8632/article/view/288040
- DOI: https://doi.org/10.14357/20718632240207
- EDN: https://elibrary.ru/WGWOTS
- ID: 288040
Abstract
A method has been developed for analyzing literary texts by selecting the most frequent auxiliary parts of speech characteristic of a particular author's style and calculating their weighting coefficients. This natural language processing (NLP) analysis counts the most frequently used prepositions, conjunctions, and particles in literary works. Each weighting coefficient is the ratio of the number of occurrences of an auxiliary part of speech in the text to the text's total volume, and the calculation is analyzed in detail. Experimental results on establishing the authorship of literary texts are presented for two authors. The results were obtained by comparing the numerical values of weighting coefficients of the same type, expressed as percentages. The theoretical and practical results can be used to analyze texts and to identify linguistic features and differences, not only in literary texts but, in the future, in texts of any genre and style.
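The weighting-coefficient calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' registered program. Several details are assumptions made only for illustration: the tiny `FUNCTION_WORDS` sample, the use of the word count as the text's "total volume", and the sum of absolute differences in `profile_distance` as the way of comparing two profiles (the abstract only says that coefficients of the same type are compared).

```python
import re
from collections import Counter

# Hypothetical, deliberately small sample of Russian prepositions,
# conjunctions and particles; the paper's actual word list is not given here.
FUNCTION_WORDS = {"в", "на", "с", "и", "а", "но", "не", "же", "ли"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def weight_coefficients(text, vocabulary=FUNCTION_WORDS):
    """Return, for each function word, its share of all words in percent.

    Assumption: "total volume" is taken to be the total number of words.
    """
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter(t for t in tokens if t in vocabulary)
    total = len(tokens)
    return {w: 100.0 * counts[w] / total for w in sorted(vocabulary)}


def profile_distance(a, b):
    """Sum of absolute differences between coefficients of the same word.

    Assumption: this measure is chosen for illustration only.
    """
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)
```

Under these assumptions, a disputed text would be attributed to the candidate author whose profile lies at the smaller distance, for example by comparing `profile_distance(disputed, author_a)` with `profile_distance(disputed, author_b)`.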
About the authors
Gurami N. Akhobadze
V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences
Author for correspondence.
Email: ahogur@ipu.ru
Professor, Doctor of technical sciences
Russian Federation, Moscow
Elena Ya. Rusyaeva
V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences
Email: rusyaeva@ipu.ru
PhD
Russian Federation, Moscow