Automatic Text Normalization in Uzbek: Problems, Tools, and Solutions

ORIGINAL SOURCE
This article was originally published in the Multidisciplinary Journal of Science and Technology.

Sobirova, Nazira G‘anijon qizi (2025) Automatic Text Normalization in Uzbek: Problems, Tools, and Solutions. Multidisciplinary Journal of Science and Technology.

The full text is not available in the archive; a link to the article's original source is given below.

Abstract

In recent years, research in Natural Language Processing (NLP) has increased the demand for automated text analysis across many languages, including Uzbek. Uzbek texts are morphologically complex, stylistically diverse, and rich in variant word forms, which makes automatic analysis difficult. This article focuses on the automatic normalization of Uzbek texts and examines the linguistic and technological issues that arise in the process. Complex morphological structures, polyform words, dialectal variants, Cyrillic-Latin script differences, and non-standard expressions all complicate normalization. The results of this research contribute to deeper digital processing of the Uzbek language and to improving the quality of machine translation, speech-to-text conversion, and text analysis systems.

Document type: Article
Authors: Sobirova Nazira G‘anijon qizi (email: unspecified)
Journal / Publication: Multidisciplinary Journal of Science and Technology
Publisher: Center for Tech and Media Research
Date: 19 June 2025
DOI / ID: oai:ojs.pkp.sfu.ca:article/4181
Keywords: Uzbek language, text normalization, natural language processing, artificial intelligence, neural networks, rule-based approach, morphological analysis, BERT, writing systems, linguistic issues
Note: Imported from MJST Journal (OAI id oai:ojs.pkp.sfu.ca:article/4181)
URI: https://arxiv.universalpublishings.com/id/eprint/60746