Oriental Studies

Cognate Identification Neural Network and Its Capabilities to Establish New Etymologies and Borrowing Sources in Eastern Yugur

https://doi.org/10.22162/2619-0990-2025-79-3-720-737

Abstract

Introduction. The paper presents results obtained with a cognate identification neural network designed to establish new etymologies and borrowing sources in Eastern Yugur. Materials and methods. The study reviews existing neural network models and their results, and characterizes the available dictionaries of Eastern Yugur. Etymologies of Eastern Yugur words have been refined on the basis of Mongolic-language dictionaries uploaded to the LingvoDoc platform. The work employs the comparative-historical method and certain functional tools of the platform, which have proved instrumental in identifying cognates for a number of Eastern Yugur words and in reconstructing some elements of Proto-Mongolic. Results. The article describes the principles of the neural network, which follows the Siamese pattern and consists of two identical branches. A total of 40 Proto-Mongolic reconstructions have been proposed for lexemes previously known only from the North Mongolic languages. In addition, the paper introduces 11 examples of Chinese borrowings in Eastern Yugur that are identified as early because they are also attested in other Mongolic languages. Several Proto-Mongolic lexical reconstructions relating to material culture are particularly noteworthy: *(h)iliɣür ‘[press] iron’, *kükür ‘sulfur’, *jaŋ- ‘cement’, *kas ‘jasper, jade’, *kuruɣub- ‘thimble’. Supplementing existing etymologies with data on Eastern Yugur and, in some cases, on other Mongolic languages from dictionaries available on LingvoDoc (Classical Mongolian, Mongolian, Buryat, Oirat, Dagur, Dongxiang, Bonan), and verifying reconstructed lexemes against Chinese dictionaries to detect borrowings, makes it possible to deepen our knowledge of Mongol cultural history and even to pinpoint the sources of certain inventions.
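
The Siamese pattern mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' model: the symbol inventory `ALPHABET`, the dimension `EMBED_DIM`, the class `SharedEncoder`, and the function `siamese_score` are all hypothetical names, the weights are random and untrained, and mean pooling stands in for whatever layers the real network uses. The only point shown is that both words pass through one and the same branch before their embeddings are compared.

```python
import math
import random

# Hypothetical toy symbol inventory and embedding size (assumptions, not from the paper).
ALPHABET = "abcdefghijklmnopqrstuvwxyzɣŋöü"
EMBED_DIM = 8


class SharedEncoder:
    """A single branch; using one object for both inputs is what makes the pair 'Siamese'."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # One random, untrained vector per symbol. A trained model would learn these.
        self.table = {
            ch: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for ch in ALPHABET
        }

    def encode(self, word):
        # Mean-pool the symbol vectors into one fixed-size embedding (ignores symbol order).
        vecs = [self.table[ch] for ch in word.lower() if ch in self.table]
        if not vecs:
            return [0.0] * EMBED_DIM
        return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def siamese_score(encoder, word_a, word_b):
    # Both words go through the identical branch (same weights), then are compared.
    return cosine(encoder.encode(word_a), encoder.encode(word_b))


if __name__ == "__main__":
    enc = SharedEncoder(seed=0)
    # Forms taken from the abstract; the scores are meaningless without training.
    print(siamese_score(enc, "kükür", "kükür"))
    print(siamese_score(enc, "kükür", "kas"))
```

Because the two branches share parameters, the score is symmetric in its arguments, and a word compared with itself scores at the maximum. In a trained system the shared encoder would be fitted on known cognate and non-cognate pairs so that high scores flag candidate cognates for manual etymological checking.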

About the Authors

Julia V. Normanskaya
Ivannikov Institute for System Programming of the RAS (25, A. Solzhenitsyn St., 109004 Moscow, Russian Federation); Institute of Linguistics of the RAS (1, Bolshoi Kislovsky Lane, 125009 Moscow, Russian Federation)
Russian Federation

Dr. Sc. (Philology), Chief Research Associate, Leading Research Associate



Oksana V. Goncharova
RUDN University (10/3, Miklouho-Maclay St., 117198 Moscow, Russian Federation)
Russian Federation

Cand. Sc. (Philology), Associate Professor



Viktoria V. Kukanova
Kalmyk Scientific Center of the RAS (8, Ilishkin St., 358000 Elista, Russian Federation)
Russian Federation

Cand. Sc. (Philology), Senior Research Associate, Director



Zayana I. Chushkaeva
Kalmyk Scientific Center of the RAS (8, Ilishkin St., 358000 Elista, Russian Federation)
Russian Federation

Junior Research Associate




For citations:


Normanskaya J., Goncharova O., Kukanova V., Chushkaeva Z. Cognate Identification Neural Network and Its Capabilities to Establish New Etymologies and Borrowing Sources in Eastern Yugur. Oriental Studies. 2025;18(3):720-737. (In Russ.) https://doi.org/10.22162/2619-0990-2025-79-3-720-737


ISSN 2619-0990 (Print)
ISSN 2619-1008 (Online)