Svetlov A.V., Komendantov A.S. Automation of the Process for Obtaining Linguistic Information: State-of-the-Art Capabilities

DOI: https://doi.org/10.15688/jvolsu2.2017.2.4

Andrey V. Svetlov

Candidate of Sciences (Physics and Mathematics), Associate Professor, Department of Mathematical Analysis and Function Theory, Volgograd State University

Prosp. Universitetsky, 100, 400062 Volgograd, Russian Federation


http://orcid.org/0000-0002-8764-6132 

Anatoly S. Komendantov

Student, Institute of Mathematics and IT, Volgograd State University

Prosp. Universitetsky, 100, 400062 Volgograd, Russian Federation


http://orcid.org/0000-0001-5009-498X 


Abstract. The paper addresses the automation of several tasks in linguistic analysis. The review part of the article surveys current linguistic software, which we classify as follows: electronic dictionaries and thesauri; text conversion programs and text generators; programs for the analysis and linguistic processing of documents; and natural language processing systems. For each group we mention examples of relevant applications or web services. In addition, we discuss the current capabilities of this software, its scope of use, and its development prospects. In the main part of the work we describe the add-on we created for the MyStem stemming utility by Ilya Segalovich. The application supplements the utility with a graphical interface that is easy to learn and intuitive for users who do not specialize in information technology. The algorithm implemented in the software uses the results of the stemming process to solve specific problems: it intercepts the output of the MyStem utility, reformats it, and runs further analysis on it. The results of this analysis form the basis for the main functions of the add-on. In this way we can obtain a frequency analysis of the text, extract specified parts of speech, and select inciting words in the text. The examples in this part of the paper show the results produced by each unit of the software. In conclusion, we make several remarks on the prospects for developing our application.
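The pipeline outlined in the abstract (take the stemmer's output, reformat it, then run frequency and part-of-speech analysis) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The sample data, the one-JSON-object-per-line layout, and the function names `parse_tokens`, `lemma_frequencies`, and `lemmas_by_pos` are all assumptions made for illustration. The `text`, `analysis`, `lex`, and `gr` fields are modeled on the kind of JSON record MyStem can produce when grammatical information is requested, and the sketch works on a hard-coded sample rather than invoking the utility.

```python
import json
from collections import Counter

# Hypothetical sample standing in for intercepted stemmer output.
# Assumption: one JSON object per token, one object per line.
SAMPLE_OUTPUT = """\
{"text": "мама", "analysis": [{"lex": "мама", "gr": "S,жен,од=им,ед"}]}
{"text": "мыла", "analysis": [{"lex": "мыть", "gr": "V,несов,пе=прош,ед,изъяв,жен"}]}
{"text": "раму", "analysis": [{"lex": "рама", "gr": "S,жен,неод=вин,ед"}]}
{"text": "мамы", "analysis": [{"lex": "мама", "gr": "S,жен,од=род,ед"}]}
"""


def parse_tokens(raw):
    """Reformat raw output into a list of (lemma, part_of_speech) pairs."""
    tokens = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        analyses = record.get("analysis") or []
        if not analyses:
            continue  # token without an analysis, e.g. punctuation
        first = analyses[0]  # keep only the first reading of the word
        # The part-of-speech tag is the first item of the grammar string.
        pos = first.get("gr", "").split("=")[0].split(",")[0]
        tokens.append((first["lex"], pos))
    return tokens


def lemma_frequencies(tokens):
    """Count how often each lemma occurs (frequency analysis)."""
    return Counter(lemma for lemma, _ in tokens)


def lemmas_by_pos(tokens, pos):
    """Return the sorted, distinct lemmas tagged with the given part of speech."""
    return sorted({lemma for lemma, tag in tokens if tag == pos})


if __name__ == "__main__":
    tokens = parse_tokens(SAMPLE_OUTPUT)
    print(lemma_frequencies(tokens).most_common())
    print(lemmas_by_pos(tokens, "S"))
```

Keeping only the first analysis of each word is a simplification. A real tool would need a policy for ambiguous words, for example relying on the stemmer's own disambiguation.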

Key words: automation, linguistic analysis, morphological analysis, automation of linguistic analysis, automation of morphological analysis, stemming, graphical interface, software shell.

Citation. Svetlov A.V., Komendantov A.S. Automation of the Process for Obtaining Linguistic Information: State-of-the-Art Capabilities. Vestnik Volgogradskogo gosudarstvennogo universiteta. Seriya 2, Yazykoznanie [Science Journal of Volgograd State University. Linguistics], 2017, vol. 16, no. 2, pp. 39-46. (in Russian). DOI: https://doi.org/10.15688/jvolsu2.2017.2.4 

Creative Commons License
Automation of the Process for Obtaining Linguistic Information: State-of-the-Art Capabilities by Svetlov A.V., Komendantov A.S. is licensed under a Creative Commons Attribution 4.0 International License.

Attachments:
4_Svetlov_Komendantov.pdf
URL: https://l.jvolsu.com/index.php/en/component/attachments/download/1542