Oriental Studies

Categorial–Semantic Markup of Basic Mongolian Lexemes: a Quantitative Perspective

https://doi.org/10.22162/2619-0990-2019-46-6-1156-1175

Abstract

The article presents a categorial-semantic notation (tagging) for the word-forms of the basic lexemes of the Mongolian language and characterizes these word-forms by their frequency of occurrence. It provides a list of the basic vocabulary of modern Mongolian (454 word-forms). Word-form frequencies were calculated on the material of the General Corpus of the Modern Mongolian Language (GCML). The work rests on the same principles of quantitative Mongolian studies as the author's previous works. The most original feature of this dictionary, however, is its special categorial-semantic notation (tagging). The notation is oriented toward the tasks of the semantic typology of the world's languages and is designed to make the dictionary typologically comparable with similarly compiled dictionaries of other languages.

The 454 basic word-forms listed in the article are all fairly common in the GCML-1a: each has an absolute frequency above 255, which corresponds to a relative frequency above 220 ipm (instances per million words). The table has the following columns:

(I) The name of the word-form in quasi-orthographic notation. This notation differs from standard spelling only in that all case distinctions (capital vs. small letters) are removed.

(II) The generalized grammateme (that is, not only grammatemes proper but also bundles of homographic grammatemes).

(III) The generalized lexeme (that is, not only lexemes proper but also bundles of homographic lexemes).

(IV) The semantic gloss attributed to the word-form or lexeme (strictly speaking, to the member of the bundle of homographic segments that coincides with the name of the word-form). This column serves as an informal mnemonic reminder of the lexical meaning of the word-form. It is intended for typologist users, especially those unfamiliar with Mongolian.

(V) The categorial-semantic gloss attributed to the word-form (more precisely, to the bundle of homographic word-forms) in the GCML. This is the key column of the table.

(VI) The absolute frequency of the word-form (more precisely, of the bundle of homographic word-forms) in the GCML-1a.

(VII) The rank of the word-form (more precisely, of the bundle of homographic word-forms) in the GCML-1a.

(VIII) The number of GCML-1a texts in which the word-form (more precisely, the bundle of homographic word-forms) occurs.

(IX) The rank of the word-form (more precisely, of the bundle of homographic word-forms) in the frequency dictionary ordered by decreasing number of GCML-1a texts.

The categorial-semantic notation (tagging) of the word-forms of the basic lexemes of the Mongolian language is presented as a table sorted in direct alphabetical order of the categorial-semantic marks.
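The quantitative columns described in the abstract (absolute frequency, relative frequency in ipm, rank by frequency, number of texts, rank by number of texts) can be illustrated with a minimal sketch. This is not the author's procedure or the GCML tooling. The function name `frequency_table`, the toy texts, the use of the toy corpus's own token count as the denominator for ipm, and the alphabetical tie-breaking in the ranks are all assumptions made for illustration. Lowercasing stands in for the quasi-orthographic notation, which removes case distinctions.

```python
from collections import Counter


def frequency_table(texts):
    """Build per-word-form statistics from a list of tokenized texts.

    `texts` is a list of token lists. Word-forms are lowercased, which
    mimics the quasi-orthographic notation (case distinctions removed).
    """
    abs_freq = Counter()    # total occurrences of each word-form
    text_count = Counter()  # number of texts containing each word-form
    for tokens in texts:
        forms = [t.lower() for t in tokens]
        abs_freq.update(forms)
        text_count.update(set(forms))  # count each text at most once

    total_tokens = sum(abs_freq.values())

    def ranks(counter):
        # Rank 1 = highest count; ties are broken alphabetically
        # (an assumption, the source does not say how ties are ranked).
        ordered = sorted(counter, key=lambda w: (-counter[w], w))
        return {w: i for i, w in enumerate(ordered, start=1)}

    freq_rank = ranks(abs_freq)
    text_rank = ranks(text_count)

    return {
        w: {
            "abs_freq": abs_freq[w],
            # instances per million tokens of this (toy) corpus
            "ipm": abs_freq[w] / total_tokens * 1_000_000,
            "freq_rank": freq_rank[w],
            "n_texts": text_count[w],
            "text_rank": text_rank[w],
        }
        for w in abs_freq
    }


# Hypothetical toy corpus of three short "texts".
toy_texts = [
    ["Энэ", "ном", "сайн"],
    ["энэ", "хүн", "ном", "уншина"],
    ["Сайн", "байна"],
]
table = frequency_table(toy_texts)
for form, row in sorted(table.items(), key=lambda kv: kv[1]["freq_rank"]):
    print(form, row)
```

In this sketch "Энэ" and "энэ" fall into one row because of the case folding. A real table of this kind would also carry the grammateme, lexeme and gloss columns, which require linguistic annotation and are not derivable from counts alone.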

About the Author

Sergey A. Krylov
Institute of Oriental Studies of the RAS
Russian Federation
Dr. Sc. (Philology), Leading Research Associate


References

1. Dawa I. et al. Multilingual Text ― Speech Corpus of Mongolian. In: International Symposium on Chinese Spoken Language Processing (ISCSLP 2006). Proceedings. (Kent Ridge, Singapore; December 13–16, 2006). Vol. II. Pp. 759–770. An Internet resource: www.isca-speech.org/archive_open/archive_papers/iscslp2006/B74.pdf (accessed: October 10, 2018). (In Eng.)

2. Krylov S. A. [Theoretical Grammar of Modern Mongolian and Related Issues of General Linguistics]. Vol. 1: ‘Morphemics, Morphonology, Elements of Phonological Transformatorics’. Moscow: Vostochnaya Literatura, 2004. 479 p. (In Russ.)

3. Krylov S. A. [Theoretical Grammar of the Mongolian Language and Related Issues of General Linguistics]. In 6 vol. Vol. 2: ‘A Structure-and-Frequency Model of Modern Mongolian’. Moscow: Vostochnaya Literatura, 2014. 637 p. (In Russ.)

4. Krylov S. A. A Consolidated Corpus of Mongolic Languages: Principles of Syntactical Analysis Revisited. [The Humanities in Southern Russia: International and Regional Cooperation]. Conf. proc. (Elista; September 14–15, 2016). Elista: Kalmyk Humanities Research Institute of RAS, 2016. Pp. 198–199. (In Russ.)

5. Krylov S. A. A structure-and-frequency model of the Mongolian language on the basis of the General Corpus of Modern Mongolian. Ural-Altaic Studies. 2012. No. 1(6). Pp. 78–105. (In Russ.)

6. Krylov S. A. Compatibility of Mongolian synthetic word forms: a quantitative aspect. Bulletin of the Kalmyk Institute for Humanities of the RAS (Oriental Studies). 2017. No. 4. Pp. 108–133. (In Russ.)

7. Krylov S. A. Hybrid genres of dictionaries revisited: a case study of the Mongolian language. In: [Oriental Studies Readings 2018]. Panina A. S. (comp.). Conf. proc. (Moscow; April 4–6, 2018). Moscow: Institute of Oriental Studies of RAS, 2018. Pp. 33–34. (In Russ.)

8. Krylov S. A. Mongolian analytical constructions: a quantitative perspective. Bulletin of the Kalmyk Institute for Humanities of the RAS (Oriental Studies). 2017. No. 5. Pp. 155–179. (In Russ.)

9. Krylov S. A. Mongolian analytical word forms: an effort of distributive and statistical classification. Oriental Studies. 2018. Vol. 36. Is. 2. Pp. 88–101. (In Russ.)

10. Krylov S. A. Mongolian analytical word forms: an effort of quantitative research. Bulletin of the Kalmyk Institute for Humanities of the RAS (Oriental Studies). 2017. No. 6. Pp. 79–93. (In Russ.)

11. Krylov S. A. On hybrid dictionaries, exemplified by Mongolian. In: [Institute of Oriental Studies of the RAS: Transactions]. Vol. 19: ‘Issues of General and Oriental Linguistics: Lexicology and Lexicography’. Shalyapina Z. M. (ed.), Panina A. S. (ed., comp.). Moscow: Institute of Oriental Studies of RAS, 2018. Pp. 156–165. (In Russ.)

12. Krylov S. A. The General Corpus of the Modern Mongolian Language and its structural-probabilistic model. In: Computational Linguistics and Intellectual Technologies. Conf. proc. (Bekasovo; May 30 – June 3, 2012). Vol. 11 (18). Moscow: Russian State University for the Humanities, 2012. Pp. 331–341. (In Eng.)

13. Krylov S. A., Dybo A. V., Sheymovich A. V. A digital Khakass-Russian dictionary: semantic and derivative tagging. Russian Turkology. 2016. No. 2. Pp. 28–39. (In Russ.)

14. Krylov S. A., Dybo A. V., Sheymovich A. V. Some possibilities of semantic and etymological tagging of corpora for Turkic languages. In: Turkic Languages Processing: TurkLang 2015. Conf. proc. Kazan, 2015. Pp. 304–327. (In Russ. and Eng.)

15. Krylov S. A. Investigating modern Mongolian: a quantitative perspective. Voprosy Jazykoznanija. 2013. No. 5. Pp. 46–57. (In Russ.)

16. Lyashevskaya O. N., Sharov S. A. New Frequency Dictionary of Russian. In: Lyashevskaya O. N., Sharov S. A. [Frequency Dictionary of Modern Russian: as Exemplified by Russian National Corpus]. Moscow: Azbukovnik, 2009. An Internet resource: http://dict.ruslang.ru/freq.php (accessed: September 10, 2019). (In Russ.)

17. Purev J., Hyun Seok Park, Altangerel Ch. Tree adjoining grammars for Mongolian. In: Proceedings of the 3rd International Conference on East-Asian Language, Processing and Internet Information Technology, EALPIIT 2003. Ulaanbaatar, 2003. Pp. 321–323. (In Eng.)

18. Purev J., Odbayar Ch. Corpus building for Mongolian language. In: Proceedings of the 6th Workshop on Asian Language Resources (11–12 January 2008, India). Hyderabad, 2008. Pp. 97–98. (In Eng.)

19. Purev J., Tsolmon Z., Altangerel Ch., Cheol-Young O. PC-KIMMO-based description of Mongolian morphology. International Journal of Information Processing Systems. 2005. Vol. 1. No. 1. Pp. 41–48. (In Eng.)

20. Sharoff S. Meaning as use: exploitation of aligned corpora for the contrastive study of lexical semantics. In: Proc. of Language Resources and Evaluation Conference (LREC02). (Las Palmas, Spain; May 2002). 2002. (In Eng.)

21. Shaykevich A. Ya. Quantitative research methods in linguistics. In: [Encyclopedic Dictionary of Linguistics]. Moscow: Sovetskaya Entsiklopediya, 1990. P. 231. (In Russ.)

22. Shaykevich A. Ya. Quantitative research methods in linguistics. In: [Great Russian Encyclopedia]. Kravets S. L. (ed.). Vol. 14: ‘Kireev – Congo’. Moscow: Bolshaya Rossiyskaya Entsiklopediya, 2009. P. 478. (In Russ.)

23. Shaykevich A. Ya., Andryushchenko V. M., Rebetskaya N. A. [Russian Prose of 1850–1870s: Distributive-Statistical Analysis of Its Language]. Vol. 1. Moscow: Yazyki Slavyanskoy Kultury, 2013. 504 p. (In Russ.)

24. Shaykevich A. Ya., Andryushchenko V. M., Rebetskaya N. A. [Russian Prose of 1850–1870s: Distributive-Statistical Analysis of Its Language]. Vol. 2. Moscow: Yazyki Slavyanskoy Kultury, 2016. 850 p. (In Russ.)


For citations:


Krylov S.A. Categorial–Semantic Markup of Basic Mongolian Lexemes: a Quantitative Perspective. Oriental Studies. 2019;12(6):1156-1175. (In Russ.) https://doi.org/10.22162/2619-0990-2019-46-6-1156-1175


ISSN 2619-0990 (Print)
ISSN 2619-1008 (Online)