Basic Words in Farsi: A Comparison of Six Studies

Document Type: Research Paper

Authors

---

Abstract

Vocabulary is one of the most important factors in foreign/second language teaching. Many scholars consider it so central that learning the vocabulary of a language is synonymous with learning the language itself. The selection and grading of the lexical content of language teaching curricula has therefore become very important. Accordingly, the use of marked words and the influence of personal tastes and preferences in selecting the lexical content of teaching resources are major challenges in Persian language learning. This study aimed to identify the most frequent words of Persian in journalistic texts. For this purpose, over 100 working days, a corpus of more than 1.2 million words was extracted from widely circulated newspapers and recorded in a database developed for this study. The corpus covered seven genres: culture, society, politics, sports, science, economy, and fiction. The frequency of each word in the corpus was then counted. The resulting list of high-frequency words was compared with and validated against the results of five other projects: Hasani (1384), Bijankhan (1390), Mahakweb (1387), Hamshahri (2009), and Nematzadeh et al. (1390). The comparison showed a difference of about 30% from the findings of the other projects. This difference is to be expected, because the projects did not draw on the same source corpora.
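The counting and comparison steps described above can be illustrated with a minimal Python sketch. It is not the study's actual procedure: the abstract does not state how tokens were segmented or how the 30% figure was computed, so whitespace tokenization, the helper names `top_words` and `percent_difference`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def top_words(tokens, n):
    """Return the n most frequent tokens, most frequent first."""
    return [word for word, _ in Counter(tokens).most_common(n)]


def percent_difference(list_a, list_b):
    """Percentage of distinct words in list_a that are absent from list_b.

    Assumption: this is one plausible way to quantify how far two
    high-frequency word lists diverge; the study may have used another.
    """
    unique_a = set(list_a)
    if not unique_a:
        return 0.0
    missing = unique_a - set(list_b)
    return 100.0 * len(missing) / len(unique_a)


# Toy stand-in for newspaper text (hypothetical data, whitespace-tokenized).
corpus_tokens = "the cat saw the dog and the dog saw the bird".split()

# Hypothetical high-frequency list from another project.
reference_list = ["the", "dog", "fish"]

our_list = top_words(corpus_tokens, 3)
print(our_list)
print(round(percent_difference(our_list, reference_list), 1))
```

With real Persian text, the tokenization step would need more care than a whitespace split, for example in how the zero-width non-joiner and affixes are handled, and such choices can change the resulting frequency list.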

Keywords


References

Bijankhan, M. (1390). A Frequency Dictionary Based on a Text Corpus of Contemporary Persian [in Persian]. Tehran: University of Tehran Press.
Hasani, H. (1384). High-Frequency Words of Contemporary Persian Based on a One-Million-Word Corpus, Including More Than 8,000 Lexical and Non-Lexical Words [in Persian]. Tehran: Iran Language Institute.
Darrudi, E., Baradaran Hashemi, H., AleAhmad, A., Zare Bidoki, A., Habibian, A., Mahdikhani, F., Shakery, A., & Rahgozar, M. (1387). A Standard Test Collection (Mahak) for Persian Web Information Retrieval Research [in Persian]. Tehran: Technical Report of the Database Research Group, University of Tehran, No. DBRG-TR-138702.
Nematzadeh, Sh., Dadras, M., Dastjerdi Kazemi, M., & Mansourizadeh, M. (1390). Basic Persian Words from the Language of Iranian Children [in Persian]. Tehran: Madreseh Borhan Cultural Institute (Madreseh Publications).
AleAhmad, A., Amiri, H., Darrudi, E., Rahgozar, M., & Oroumchian, F. (2009). Hamshahri: A Standard Persian Text Collection. Knowledge-Based Systems, 22(5), pp. 382–387.
Baker, P., Hardie, A., & McEnery, T. (2006). A Glossary of Corpus Linguistics. Edinburgh: Edinburgh University Press.
Barnett, B., Lehmann, H., & Zoeppritz, M. (1986). A Word Database for Natural Language Processing. Proceedings of the 11th International Conference on Computational Linguistics (COLING 86).
Bullon, S., & Leech, G. (2007). Longman Communication 3000. Harlow: Pearson Longman.
Carroll, J. B., Davies, P., & Richman, B. (1971). The American Heritage Word Frequency Book. Boston: Houghton Mifflin.
Coxhead, A. (2000). A New Academic Word List. TESOL Quarterly, 34(2), pp. 213–238.
Davies, M., & Gardner, D. (2010). A Frequency Dictionary of Contemporary American English: Word Sketches, Collocates and Thematic Lists. London: Routledge.
Dixon, R. M. W. (1971). Method of Semantic description. In L. Verhoeven and J.H.A.L de Jong (eds.)
Dolch, E. W. (1936). A Basic Sight Vocabulary. Elementary School Journal, 36, pp. 456–460.
Fry, E. B., Kress, J. E., & Fountoukidis, D. L. (2000). The Reading Teacher's Book of Lists, 4th Edition. London: Pearson Ptr.
Jones, R. L., & Tschirner, E. (2006). A Frequency Dictionary of German. London: Routledge.
Käding, F. W. (1897). Häufigkeitswörterbuch der deutschen Sprache. Steglitz: no publ.
Laufer, B. (1997). The Lexical Plight in Second Language Reading: Words You Do Not Know, Words You Think You Know, and Words You Can not Guess. In J. Coady & T. Huckin (Eds.), Second Language Vocabulary Acquisition (pp. 20–34). Cambridge: Cambridge University Press.
McCarthy, M. (1990). Vocabulary. Oxford: Oxford University Press.
Meara, P. (1980). Vocabulary Acquisition: A Neglected Aspect of Language Learning. Language Teaching & Linguistics Abstracts, 13(4a), pp. 221–247.
Milton, J. (2009). Measuring Second Language Vocabulary Acquisition. Bristol: Multilingual Matters.
Nation, P. (2001). Learning Vocabulary in Another Language. Cambridge, UK: Cambridge University Press.
Nation, P. (2006). How Large a Vocabulary Is Needed for Reading and Listening? The Canadian Modern Language Review, 63(1), pp. 59–82.
Nation, P. (2007). Teaching Vocabulary: Strategies and Techniques. New York: Thomson/Heinle.
Ogden, C. K., & Richards, I. A. (1923). The Meaning of Meaning. London: Kegan, Paul, Trench, Trubner.
Oxford (2008). My Oxford Wordlist. Oxford University Press.
Thornbury, S. (2004). How to Teach Vocabulary. Essex: Pearson Education Limited.
Thorndike, E. L. (1921). The Teacher's Word Book. New York: Columbia University Press.
Verlinde, S., & Selva, T. (2001). Nomenclature de Dictionnaire et Analyse de Corpus. Cahiers de Lexicologie, 79(2), pp. 113–139.
Vermeer, A. (1992). Exploring the Second Language Learner Lexicon. In L. Verhoeven & J. H. A. L. de Jong (Eds.), The Construct of Language Proficiency: Applications of Psychological Models to Language Assessments (pp. 147–171). Amsterdam: John Benjamins.
Wilkins, D. A. (1972). Linguistics in language teaching. London: Edward Arnold.