Development of a Parallel Corpus of the Uzbek and Russian Languages
Keywords:
computational linguistics, corpus linguistics, databases, development, MongoDB, NoSQL
Abstract
This paper describes the development of a parallel Uzbek-Russian corpus. It covers the general mechanism of the corpus, the structure of the text database, the text-processing algorithms, and automated management of the corpus with the author's program Uz-Rus-Corp. The parallel corpus is intended to support machine translation of texts from Uzbek into Russian.