Beliaeva L.N., Kamshilova O.N. Lexicographic Problems of Machine Translation Systems: On the Way from Literal to Neural

DOI: https://doi.org/10.15688/jvolsu2.2024.5.1

Larisa N. Beliaeva

Doctor of Sciences (Philology), Professor, Department of Educational Technologies in Philology, Herzen State Pedagogical University of Russia

Reki Moiki Emb., 48, 191186 Saint Petersburg, Russia

https://orcid.org/0000-0002-8622-4595

Olga N. Kamshilova

Candidate of Sciences (Philology), Associate Professor, Department of Educational Technologies in Philology, Herzen State Pedagogical University of Russia

Reki Moiki Emb., 48, 191186 Saint Petersburg, Russia

Associate Professor, Department of Linguistics and Translation Studies, Saint Petersburg University of Management Technologies and Economics

Prosp. Lermontovsky, 44, 190020 Saint Petersburg, Russia

https://orcid.org/0000-0002-1488-2206


Abstract. The article discusses current issues in how modern machine translation systems (MT systems) interpret out-of-vocabulary words, in the context of changing forms and ways of maintaining an automatic dictionary. It provides a critical outline of the typology of MT systems and of the strategies for their development. It describes the impact of fast-developing software and technologies on these strategies and analyzes the changes they bring to the forms of dictionary support. The research shows that linguistic support and the structure of automatic dictionaries are fundamentally important for ensuring translation quality, whatever the type of MT system. Despite all the success of neural MT (NMT) systems, their automatically updated vocabulary databases do not record words that are terminologically specific and occur with low frequency in the special texts and text corpora on which the system is trained. Analysis of translations performed by two popular NMT systems (Google Translate and Yandex Translate) shows that they fail to process, and to translate consistently, words that are not entered in the system dictionaries, a task that users of all types of MT systems used to solve easily with the help of automatic dictionaries. With statistics-based automatic dictionaries, this remains a pressing problem and requires a special approach when editing MT results.

Key words: machine translation, machine translation strategy, typology of machine translation systems, automatic dictionary, out-of-vocabulary words, linguistic support.

Citation. Beliaeva L.N., Kamshilova O.N. Lexicographic Problems of Machine Translation Systems: On the Way from Literal to Neural. Vestnik Volgogradskogo gosudarstvennogo universiteta. Seriya 2. Yazykoznanie [Science Journal of Volgograd State University. Linguistics], 2024, vol. 23, no. 5, pp. 6-19. (in Russian). DOI: https://doi.org/10.15688/jvolsu2.2024.5.1

Lexicographic Problems of Machine Translation Systems: On the Way from Literal to Neural by Beliaeva L.N., Kamshilova O.N. is licensed under CC BY 4.0

Attachments:
1_Beliaeva_Kamshilova.pmd.pdf
URL: https://l.jvolsu.com/index.php/en/component/attachments/download/3019