
Language: The carrier of cultural code

The issue of safeguarding cultural heritage and linguistic variety online becomes increasingly urgent every year, and one of the sessions at RIGF 2026 was dedicated to this topic.

“What is a cultural code? It is the key to understanding a people’s culture, its unique characteristics, and its history. It is passed down through language, art, and traditions. And language here is not just a tool for communication but the primary carrier of meaning,” said moderator Andrey Vorobyev, Director of the Coordination Center for TLD .RU/.РФ, while opening the session.

Participants sought to answer the following questions: what happens to the cultural code as life increasingly moves into the digital space? Which digital technologies help preserve languages?

“If a language is not represented online, it gradually disappears from everyday life. If there are no keyboard layouts, no dictionaries, no machine translation systems, then using that language in the digital environment becomes extremely difficult. This is not a hypothetical threat but a reality facing dozens of minority languages across the globe,” Andrey Vorobyev emphasized.

According to Sergey Chumarev of the Russian Foreign Ministry, the main threat is the loss of native speakers, which occurs when the younger generation stops using its native language for everyday communication. He stressed that Russia is making a significant contribution to the digitalization of indigenous languages, including through the International Decade of Indigenous Languages (2022-2032), led by UNESCO. Moreover, this digital shift is driven not only by government support but also by the grassroots efforts of activists working to preserve native languages.

“Almost all countries with multiethnic populations have some element of public-private partnership in this area,” Sergey Chumarev concluded.

Last fall, at the World Telecommunication Development Conference (WTDC) in Baku, ICANN proposed developing new indicators, in collaboration with the International Telecommunication Union (ITU), to assess how widely universal acceptance of multilingual domain names and email addresses has been adopted in different countries. Farid Nakhli, Program Coordinator at the ITU Regional Office for the CIS, noted in his presentation that ITU indicators often appear in the target metrics of telecommunications administrations.

“Metrics assessing how well different countries have implemented universal acceptance in the technical systems of public and government services could give them extra incentive to pay attention to this issue. However, the initiative to develop them must come from the relevant community,” the speaker explained.

Timur Tsybikov of the Federal Agency for Ethnic Affairs presented the results of the agency’s extensive monitoring of the status and development of Russia’s official languages in information technology. Forty-one constituent regions of the Russian Federation took part, with data collected on 70 indigenous and regional languages. The number of languages with electronic dictionaries rose from 59 in 2024 to 68 in 2025. Language corpora now exist for 42 languages, and keyboard layouts have been developed for 59.

Alexander Bolkhovityanov of Yandex noted that Russia is home to 194 ethnic groups speaking more than 300 languages. Of these, 150 are the languages of Russia’s indigenous and regional communities.

“But only representatives of the regions can compile these languages into corpora – without their participation, we cannot accomplish this; it is a very complex task,” the speaker explained.

He also noted that going digital influences how languages evolve, as new words and concepts emerge to reflect modern reality. Yandex is carrying out its work on these languages as part of the Languages Spoken in Russia project.

Sergei Markov of Sberbank spoke about the importance of multilingualism for modern society. He emphasized that there are over 7,000 languages in the world, each carrying a unique cultural and linguistic heritage, and that the loss of this heritage impoverishes humanity. Large language models (LLMs) contribute significantly to preserving these languages: they support low-resource languages, many of which are on the verge of extinction and have little written material. At the same time, multilingual models improve the quality of machine translation: adding multilingual data to LLM training corpora expands their linguistic capabilities. The models’ “intelligence” also grows thanks to the unique cultural elements embedded in different linguistic traditions. Furthermore, there is an ethical dimension: developing multilingual models promotes more equitable technological development by reducing the dominance of a few of the world’s most widely spoken languages.

Igor Pozdeyev of the Udmurt Federal Research Center, Ural Branch of the Russian Academy of Sciences, discussed practical steps being taken to preserve and develop the Udmurt language. In 2019, the National Corpus of the Udmurt Language was created – an information and reference system based on a digital collection of Udmurt texts. The speaker explained that since then, more than 11 million word forms have been added to the corpus, a corpus of historical Udmurt written records has been developed, and a part-of-speech tagger has been created. Igor Pozdeyev also discussed collaboration with Yandex. According to measurements taken in March 2026, updating the Udmurt translation model in Yandex significantly improved translation quality in both directions: 4.5-fold for Udmurt to Russian (udm-ru) and 7.86-fold for Russian to Udmurt (ru-udm).

Maria Kolesnikova of the Coordination Center for TLD .RU/.РФ reported that in 2025, the list of acceptable Cyrillic characters in .РФ was expanded to include 18 minority languages of Russia in addition to Russian. This list will be expanded further, and all official languages spoken in the regions of the Russian Federation will soon be supported in the .РФ ccTLD. In this way, the Coordination Center is contributing to the preservation and development of multilingualism and cultural diversity across our country in the digital environment.
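
For readers unfamiliar with how Cyrillic domain names work in practice, the following minimal Python sketch shows how a name in .РФ is converted to the ASCII-compatible (Punycode) form used in the DNS, and back. It relies on Python's built-in idna codec, which implements the older IDNA 2003 rules; it is not the Coordination Center's own validation logic, and the function names and the domain "пример.рф" ("example") are purely illustrative assumptions.

```python
# Illustrative sketch only: converts an internationalized domain name
# to its ASCII-compatible encoding and back, using Python's standard
# "idna" codec (IDNA 2003). Registry-specific character rules for .РФ
# are NOT checked here.

def to_ascii(domain: str) -> str:
    """Return the ASCII-compatible (xn--) form of a Unicode domain name."""
    return domain.lower().encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Return the Unicode form of an ASCII-compatible domain name."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    name = "пример.рф"  # hypothetical example name
    encoded = to_ascii(name)
    print(encoded)              # ASCII form stored in the DNS
    print(to_unicode(encoded))  # human-readable Cyrillic form
```

Software that handles domain names and email addresses needs to accept both forms, which is the practical meaning of the universal acceptance discussed earlier in the session.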

In conclusion, Andrey Vorobyev noted that Russia is one of the few countries where the digital preservation of linguistic diversity is being addressed systemically, which is yielding tangible results. However, a number of serious challenges remain, ranging from Unicode support gaps to a shortage of datasets and domain experts. Overcoming these challenges requires the active participation of native speakers, volunteers, and regional communities.

08.04.2026