Kamensky M.V., Bredikhin S.N. Algorithmic Procedures of Identifying Advertisement Texts in Mass Media Discourse

DOI: https://doi.org/10.15688/jvolsu2.2025.1.6

Mikhail V. Kamensky

Doctor of Sciences (Philology), Professor, Department of Linguistics, Linguodidactics and Intercultural Communication, North-Caucasus Federal University, Stavropol, Russia

https://orcid.org/0000-0001-8358-9516

Sergey N. Bredikhin

Doctor of Sciences (Philology), Professor, Department of Theory and Practice of Translation, North-Caucasus Federal University, Stavropol, Russia

https://orcid.org/0000-0002-2191-4982


Abstract. The article presents an algorithm for identifying advertisement blocks in mass media content and classifying a given text as either an advertisement or an informative text. The procedure is automated with the aid of intelligent semantic and syntactic analysis systems. The GATE corpus manager serves as the development environment, and the ANNIE Gazetteer, JAPE Transducer, and Java Regexp Annotator serve as the principal processing resources. The ANNIE Gazetteer enables automated identification of the lexical units most typical of advertisements, as well as various lexical and syntactic markers of advertisement content. The JAPE Transducer enables the development of an algorithm that identifies an array of lexical and syntactic means of psychological influence. Lexical repetitions of proper nouns are identified using a regular expression for the Java Regexp Annotator processing resource. The list of tokens used as advertisement content markers is identified and described. It is noted that lexical and syntactic means of manipulative influence dominate in advertisement texts. The findings indicate a significant difference between advertisements and informative texts in the ratio of search results when texts are processed automatically with the aid of formal markers. This demonstrates the effectiveness of natural language processing systems in identifying messages with explicit and implicit advertisement content, determining the discursive type of media texts, and classifying them as either informative texts or advertisements.
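The marker-based idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GATE pipeline: the marker list `AD_MARKERS`, the capitalized-word pattern standing in for proper-noun detection, and the function names are all hypothetical, chosen only to show how gazetteer-style lookups and repeated proper nouns might be combined into a per-text ratio.

```python
import re
from collections import Counter

# Hypothetical marker list; the article's actual gazetteer entries are not reproduced here.
AD_MARKERS = {"discount", "sale", "buy", "offer", "free"}

# Crude stand-in for proper-noun detection: any capitalized word.
# This also matches sentence-initial words, which a real tagger would filter out.
PROPER_NOUN = re.compile(r"\b[A-Z][a-z]+\b")


def marker_hits(text):
    """Count tokens that appear in the hypothetical marker list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in AD_MARKERS)


def repeated_proper_nouns(text, min_count=2):
    """Return capitalized words that occur at least min_count times."""
    counts = Counter(PROPER_NOUN.findall(text))
    return {word: n for word, n in counts.items() if n >= min_count}


def marker_density(text):
    """Ratio of marker hits plus repeated-name occurrences to all tokens."""
    tokens = re.findall(r"[A-Za-z]+", text)
    if not tokens:
        return 0.0
    repeats = sum(repeated_proper_nouns(text).values())
    return (marker_hits(text) + repeats) / len(tokens)


ad = "Buy Acme today. Acme offers a free gift. Acme sale ends soon."
news = "The council met on Monday to discuss the budget."
print(marker_density(ad), marker_density(news))
```

Under these assumptions, a text with many marker hits and a repeated brand name yields a higher ratio than a neutral news sentence, which mirrors the kind of difference the abstract reports between advertisements and informative texts.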

Key words: automated analysis system, semantic-and-syntactic analyzer, mass media, corpus analysis, manipulative discourse, advertisement content, automated search algorithms.

Citation. Kamensky M.V., Bredikhin S.N. Algorithmic Procedures of Identifying Advertisement Texts in Mass Media Discourse. Vestnik Volgogradskogo gosudarstvennogo universiteta. Seriya 2. Yazykoznanie [Science Journal of Volgograd State University. Linguistics], 2025, vol. 24, no. 1, pp. 64-78. (in Russian). DOI: https://doi.org/10.15688/jvolsu2.2025.1.6

Algorithmic Procedures of Identifying Advertisement Texts in Mass Media Discourse by Kamensky M.V., Bredikhin S.N. is licensed under CC BY 4.0

Attachments:
4_Kamensky_Bredikhin.pdf
URL: https://l.jvolsu.com/index.php/en/component/attachments/download/3094