Sorokina S.G. Exploring Automated Summarization: From Extraction to Abstraction

DOI: https://doi.org/10.15688/jvolsu2.2024.5.4

Svetlana G. Sorokina

Candidate of Sciences (Philology), Associate Professor, Institute of Linguistics and Intercultural Communication, I.M. Sechenov First Moscow State Medical University

Trubetskaya St, 8, Bld. 2, 119048 Moscow, Russia


https://orcid.org/0000-0002-8667-6743


Abstract. This paper reviews AI-powered automated summarization models, focusing on two principal approaches: extractive and abstractive. The study aims to evaluate how well these models generate concise yet meaningful summaries and to analyze their lexical proficiency and linguistic fluidity. Compression rates are assessed using quantitative metrics (page, word, and character counts), while language fluency is described in terms of the ability to manipulate grammatical and lexical patterns without compromising meaning or content. The study draws on a selection of scientific publications across various disciplines to test the functionality and output quality of the automated summarization tools Summate.it, WordTune, SciSummary, Scholarcy, and OpenAI ChatGPT-4. The findings reveal that the selected models employ a hybrid strategy that integrates extractive and abstractive techniques. The summaries they produced varied in completeness and accuracy, with page compression rates ranging from 50% to 95% and character count reductions reaching up to 98%. Qualitative evaluation indicated that the models generally captured the main ideas of the source texts, although some summaries suffered from oversimplification or misplaced emphasis. Despite these limitations, automated summarization models show significant potential as tools for both text compression and content generation, which highlights the need for continued research, particularly from the perspective of linguistic analysis. Summaries generated by AI models offer new opportunities for analyzing machine-generated language and provide valuable data for studying how algorithms process, condense, and restructure human language.
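
The compression metrics mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula used in the study, so the code below assumes that a compression rate is the percentage reduction from source to summary; the function names `compression_rate`, `text_counts`, and `compression_report` are hypothetical and are not taken from the paper. Page counts are omitted because they depend on document layout rather than on the text alone.

```python
def compression_rate(source_count: int, summary_count: int) -> float:
    """Percentage reduction from source to summary.

    Assumption: rate = 100 * (source - summary) / source.
    The paper's own formula is not given in the abstract.
    """
    if source_count <= 0:
        raise ValueError("source_count must be positive")
    return 100.0 * (source_count - summary_count) / source_count


def text_counts(text: str) -> dict[str, int]:
    """Count whitespace-separated words and characters in a text."""
    return {"words": len(text.split()), "characters": len(text)}


def compression_report(source: str, summary: str) -> dict[str, float]:
    """Compression rate per unit (words, characters), rounded to one decimal."""
    src = text_counts(source)
    summ = text_counts(summary)
    return {
        unit: round(compression_rate(src[unit], summ[unit]), 1)
        for unit in src
    }
```

Under this assumed definition, a 20-page article condensed to a 1-page summary would have a page compression rate of 95%, the upper end of the range reported above.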

Key words: automated summarization, extractive summarization, abstractive summarization, artificial intelligence, neural networks, interdisciplinary research.

Citation. Sorokina S.G. Exploring Automated Summarization: From Extraction to Abstraction. Vestnik Volgogradskogo gosudarstvennogo universiteta. Seriya 2. Yazykoznanie [Science Journal of Volgograd State University. Linguistics], 2024, vol. 23, no. 5, pp. 47-59. DOI: https://doi.org/10.15688/jvolsu2.2024.5.4

Exploring Automated Summarization: From Extraction to Abstraction by Sorokina S.G. is licensed under CC BY 4.0

Attachments:
4_Sorokina.pmd.pdf
URL: https://l.jvolsu.com/index.php/en/component/attachments/download/3025