A Corpus-Based Analysis of Orthographic Errors of Arabic Persian Learners at the Advanced Level in the Framework of Persian Script Instruction

Document Type: Research Paper

Authors

1 PhD candidate of Linguistics, Islamic Azad University, South Tehran Branch, Tehran, Iran.

2 Corresponding author, PhD, Assistant Professor of English Language Teaching and Translation Department, Islamic Azad University, Karaj Branch, Karaj, Iran.

3 PhD Graduate in General Linguistics, Persian Language Instructor, International College of Tehran University of Medical Science, Tehran, Iran.

10.30479/jtpsol.2024.19654.1658

Abstract

In recent decades, a growing number of non-Persian speakers have visited Iran, eager to gain a deeper understanding of Iranian culture and civilization. At the same time, there has been a notable influx of Arabic-speaking students enrolling in Iranian universities. This research therefore examines the Persian writing of advanced-level Arabic-speaking learners from a corpus-based perspective. The study aims to identify spelling and technical errors in their written work, measured against the writing guidelines established by the Academy of Persian Language and Literature. To this end, a corpus of 151 Persian essays was compiled from Iraqi Arab learners who took the Persian proficiency test at Imam Reza International University. The corpus consists of 3,234 types and 32,096 tokens. Word frequencies were extracted from the essays with the WordSmith software. Using content analysis, the study examined six error types: spelling/stylistic errors, half-space (semi-space) deletion errors, space deletion errors, space insertion errors, cross-typing, and separate writing. The findings showed that the most prevalent errors were in the space insertion category, followed by half-space deletion errors, spelling/stylistic errors, and spelling errors. Separate writing had the lowest occurrence among the attested categories, and no errors were identified in the space deletion category. The analysis revealed that these errors predominantly originated from intralingual factors, specifically second-language learning and developmental errors.
Extended Abstract:
Introduction
In recent decades, many non-Persian speakers have travelled to Iran to learn more about Iranian culture and civilization. Notably, in 2018, before the Coronavirus outbreak, Iraqis outnumbered visitors from all other nations, according to statistics from the Ministry of Cultural Heritage, Tourism, and Handicrafts.
Beyond tourism, there has been a notable influx of Arabic-speaking students enrolling in Iranian universities. To accommodate them, various centers dedicated to teaching Persian to non-native speakers are active. These centers focus on developing proficiency in the four essential language skills (reading, listening, writing, and speaking) as well as imparting knowledge of Persian script.
The teaching of Persian script is driven by the dual purpose of honoring the essence of the Persian language and safeguarding its official script status in Iran. As the official script of the country, the Persian script matters not only to Iranians but also plays a crucial role in presenting the history of Iranian culture and civilization to foreigners. According to Article 15 of the Constitution of the Islamic Republic of Iran, the Persian script is mandated for all official documents, correspondence, and textbooks. This necessitates clear rules to ensure uniformity and preserve the integrity of the Persian script. The Academy of Persian Language and Literature, fulfilling its duty to protect the Persian language and script, has issued a comprehensive set of guidelines titled "Persian Script Instructions" to regulate and maintain the Persian script.
This research analyzes the orthographic errors made by 151 non-native Persian language learners, specifically Iraqi participants in the Persian language proficiency test conducted by Imam Reza International University. The analysis is conducted within the framework set by the Academy of Persian Language and Literature.
Given the contemporary emphasis on objective linguistic data and the pivotal role of corpora in linguistic studies, this research uses a linguistic corpus for data analysis. Corpora have proved effective in second language teaching, and this approach enables language teachers to observe and assess learners' progress.
 
Methodology
This research focuses on 151 non-native Persian language learners, primarily from Iraq, who took part in the Persian language proficiency test at Imam Reza International University. The test assesses their proficiency in Persian for pursuing Master's degree studies.
In the next step, the compositions were typed. To enable computer processing, the formatted file was then converted to a plain text file with the web-based software Espoz. Next, WordSmith corpus analysis software was used to prepare a word list giving the frequency of each word, its percentage, and its dispersion across the entire text. These words and their frequencies were then analyzed.
This study is corpus-based: data were extracted from a linguistic corpus with the WordSmith corpus analysis software. The self-motivated method was employed for data collection. The corpus comprises 32,096 tokens of Persian writing by Arabic-speaking learners, representing 3,234 types. The type count covers vocabulary only and excludes punctuation marks. A type is identified by the space on either side of it, which distinguishes it from other forms.
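The word list described above (frequency and percentage per word, with types counted over vocabulary only) can be approximated in a few lines. The following is a minimal sketch, not WordSmith's actual procedure: the function names `tokenize` and `word_list` and the punctuation set are assumptions made for illustration, and tokens are taken to be space-delimited strings.

```python
from collections import Counter

# Assumed punctuation set (Latin and Persian marks); not taken from the study.
PUNCTUATION = ".,:;!?()[]{}\"'«»،؛؟"

def tokenize(text):
    """Split on whitespace and strip punctuation from both ends of each chunk."""
    tokens = []
    for chunk in text.split():
        word = chunk.strip(PUNCTUATION)
        if word:  # chunks that were pure punctuation are dropped
            tokens.append(word)
    return tokens

def word_list(text):
    """Return (token_count, type_count, rows); each row is (word, freq, percent)."""
    tokens = tokenize(text)
    freq = Counter(tokens)
    total = len(tokens)
    rows = [(word, n, 100.0 * n / total) for word, n in freq.most_common()]
    return total, len(freq), rows
```

Under these assumptions, the corpus figures correspond to `token_count` (32,096) and `type_count` (3,234); a different tokenization rule would give different totals.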
 
Results
In line with Modarres Khiabani's research (2016), this study examines errors in six categories, judged against the Persian script and correct word forms: spelling/stylistic errors, insertion of space, deletion of space, deletion of half-space, cross-typing, and separate writing.
A spelling error is a case in which the intended word is spelled incorrectly. The learners' spelling errors were extracted from the corpus and examined in two areas: phonetic (consonant and vowel) errors and calligraphic errors. The phonetic domain contained 421 instances, a relatively high frequency. Incorrect use of signs such as hamza and tanwin, together with the use of Arabic-derived words by Arabic-speaking learners, produced 586 spelling errors, which were examined in the calligraphic group.
A stylistic error is a case in which the intended word is not misspelled but is inappropriate for the formal style of the test in question, here the Persian language proficiency test. There were 221 such instances in the corpus.
Two types of space insertion errors were observed in the learners' corpus: insertion of a space before punctuation marks such as the period, comma, and colon, and insertion of a space between the parts of compound words. Spaces were inserted before these three punctuation marks in 1,459, 136, and 5 cases, respectively; the count for the period is especially high.
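A count of spaces before punctuation marks like the one reported above can be sketched with a regular expression. This is an illustrative sketch only; the study describes content analysis and does not state how the counts were obtained. The function name, the set of marks (Latin period, Latin and Persian comma, colon), and the treatment of tabs and non-breaking spaces as spaces are all assumptions.

```python
import re
from collections import Counter

# Assumed marks: "." (period), "," and U+060C (comma), ":" (colon).
MARK_NAMES = {".": "period", ",": "comma", "\u060c": "comma", ":": "colon"}

# One or more space-like characters immediately followed by one of the marks.
SPACE_BEFORE_MARK = re.compile(r"[ \t\u00a0]+([.,\u060c:])")

def count_space_before_punctuation(text):
    """Count, per mark name, how often whitespace directly precedes the mark."""
    counts = Counter()
    for match in SPACE_BEFORE_MARK.finditer(text):
        counts[MARK_NAMES[match.group(1)]] += 1
    return counts
```

A pattern this simple will also flag cases an annotator might accept, such as a space before an ellipsis, so its output would need manual review.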
A space deletion error occurs when no space is inserted between two independent words. No such errors were observed in the learner corpus; learners showed a strong tendency to insert spaces between all words in their writing.
Because the learners' original texts were handwritten, this type of error is difficult to examine. Nevertheless, half-space deletion is among the most frequent errors in the corpus: 1,583 instances indicate that Arabic-speaking Persian learners do not pay attention to the use of the half-space in compound words and after or before some inflectional affixes such as "mi-", "-ha", and "-haye".
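The affix cases mentioned above can be illustrated with a small sketch that looks for a plain space where the half-space (zero-width non-joiner, U+200C) would be expected after "mi-"/"nemi-" or before "-ha"/"-haye". The patterns and the function name `suggest_half_space` are assumptions for illustration. The sketch only covers the variant written with a full space; whether the study classed such cases as half-space deletion or as space insertion is not stated here, and the fully joined variant is not detected.

```python
import re

ZWNJ = "\u200c"  # zero-width non-joiner, the Persian "half-space"

# Standalone "می" or "نمی" followed by spaces and then another word.
PREFIX_MI = re.compile(r"(?<!\S)(ن?می) +(?=\S)")
# A word followed by spaces and then standalone "ها" or "های".
SUFFIX_HA = re.compile(r"(?<=\S) +(ها|های)(?!\S)")

def suggest_half_space(text):
    """Replace the flagged spaces with ZWNJ; return (new_text, replacements)."""
    fixed, n_prefix = PREFIX_MI.subn(lambda m: m.group(1) + ZWNJ, text)
    fixed, n_suffix = SUFFIX_HA.subn(lambda m: ZWNJ + m.group(1), fixed)
    return fixed, n_prefix + n_suffix
```

Because "می" can also occur as an independent word, hits from a rule like this are candidates for review rather than confirmed errors.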
A cross-typing error is a case in which forms that should be written separately are written together; 138 such errors occurred in the corpus.
A separate writing error is a case in which forms that should be written together are written separately; 53 such errors occurred in the corpus.
Analysis of each error type within the Persian language learner corpus reveals the highest frequency in the space insertion category, followed by deletion of half-space, spelling/stylistic errors, and spelling errors. Notably, no errors were found in the space deletion category.
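The ranking can be checked against the counts reported in this section with a short tally. This is a sketch under stated assumptions: the category labels and the function name `rank_categories` are introduced here for illustration, the phonetic and calligraphic spelling counts are assumed not to overlap, and the space insertion figure covers only spaces before punctuation, since no count is given for spaces inside compound words.

```python
# Counts as reported in the Results; groupings are assumptions for illustration.
REPORTED = {
    "space insertion (before punctuation only)": 1459 + 136 + 5,
    "half-space deletion": 1583,
    "spelling (phonetic + calligraphic)": 421 + 586,
    "stylistic": 221,
    "cross-typing": 138,
    "separate writing": 53,
    "space deletion": 0,
}

def rank_categories(counts):
    """Return (category, count) pairs sorted from most to least frequent."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for category, count in rank_categories(REPORTED):
        print(f"{count:>5}  {category}")
```

On these figures, space insertion before punctuation alone (1,600) already exceeds half-space deletion (1,583), which matches the order given above.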
Errors are categorized following Zia Hosseini's classification, which distinguishes two types: interlingual errors, stemming from interference of the learner's mother tongue, and intralingual errors, resulting from incorrect or incomplete learning of the target language. The study indicates that the writing errors stem largely from intralingual sources, highlighting the impact of learning and developing a second language on the prevalence and nature of these errors.
Conclusion
The findings reveal that the most frequent errors are space insertion and half-space deletion, which points to a possible lack of training in proper spacing before punctuation marks and in the use of the half-space in Persian. Notably, the advanced-level Arabic-speaking learners of Persian also show a high frequency of spelling and stylistic errors, underscoring the need for more precise teaching points and additional exercises. Although half-space omission errors are more frequent than spelling/stylistic errors, the latter matter more for communication and comprehension. Errors associated with cross-typing and separate writing rank third, highlighting the complexity of these topics. This suggests a pressing need for more detailed training and more practice for these learners in these specific areas.
Considering the results of this research and since the Persian writing system is an alphabetic system, it seems that phonological awareness, as a component, has a significant impact on reading and writing in this language.
Based on the identified errors, it is expected that revisions will be made to Persian teaching resources, and exercises and classroom activities will be designed to reduce the errors of Persian learners.
In addition to using instructional resources, regular and periodic evaluations can be a fundamental step towards facilitating the learning process and reducing errors.
 
Conflict of Interest
There was no conflict of interest in this research.
Acknowledgment
Appreciation goes to all the people who helped us in doing this research, in particular Dr. Jalal Rahimian, who is a professor of linguistics in the Department of Foreign Languages and Linguistics at Shiraz University.
 


References:
Afshinpoor, M., et al., (2018). The Lexical syntagmatic errors among Arabic speaking Farsi learners. Language Science, 8 (5), 7-30. [In Persian]
Ahmadi, T., et al., (2021). Providing a suitable method for allophonic labeling of speech corpuses of according to the IPA system. Scientific Journal of Language Research, 13 (38), 185-212. [In Persian]
Ahmadvand, A., (2010). An orthographic error analysis of German learners of Persian at the elementary level (Master Dissertation). Shahid Beheshti University, Tehran, Iran. [In Persian]
Asadollahi, Kh., & Azarnivar, L. (2022). Pathology of Writing and Editing of Civil Law texts based on Persian Grammar. Rhetoric and grammar studies, 11(20), 287-315. [In Persian] 
Ashoori, D., (1986). Some suggestions about writing method and Persian script. Publication of knowledge, 36, 2-8. [In Persian]
Assi, M., (2000). From language corpora to corpus linguistics. Iranian Linguistics Conference. [In Persian]
Atar Sharghi, N., (2014). Analytical review of internal and external approaches to the linear regime change in Persian: Avrvfarsy case, ionic inquiry Parseek. Language Related Research, 34 (7), 143-174. [In Persian]
Azarang, A., (2004). Cultural, educational, and communication policies, and reforming the Persian script. Persian Letter, 3 (9), 5-14. [In Persian]
Bahmanyar, A., (1942). Persian spelling, recommended to the Academy of Persian Language and Literature. Nameh Farhangestan, 4, 42-66. [In Persian]
Baker, P., Hardie, A., & McEnery, T. (2006). A Glossary of Corpus Linguistics. Scotland: Edinburgh University Press.
Behtooei, M., (2001). The style of Persian calligraphy. Cheesta Magazine, 182-183, 174-184. [In Persian]
Borjiyan, H., (1993). Tajikistan's experience in changing the Persian script. Iranology, 17, 172-180. [In Persian]
Brezina, V. & Gablasova, D. (2018). The corpus method. Available at https://eprints.lancs.ac.uk/
Corder, S. P. (1967). The Significance of Learners’ Errors. International Review of Applied Linguistics in Language Teaching. 5, 161-170.
Farshidvar, Kh., (1971). A discussion about the Persian script and suggestions to standardize it, Vahid, 9 (9), 1318-1330. [In Persian]
Hong, H., (2011). An analysis of the syntactic errors of Vietnamese Speakers in learning Persian (Master Dissertation). Allameh Tabatabaei University, Tehran, Iran. [In Persian]
James, C. (1998). Errors in language learning and use: Exploring error analysis. England: Pearson Education.
Kazemi Moosavi, A., (1990). A suggestion to improve Persian script. Iran nameh, 33, 151-158. [In Persian]
Keshavarz, M.H. (1994). Contrastive analysis and error analysis. Tehran: Rahnama Publications.
Khalili Ardali, V. et al., (2017). Persian Writing in the most viewed websites in Iran from a technical editing point of view. Quarterly Journal of Research in Persian Language and Literature, 46, 69-98. [In Persian]
Khatami, F. et al., (2019). Comparison of vowel and consonant insertion errors in speech of intermediate and advanced Iraqi students. The first national interdisciplinary conference on Iran studies, linguistics and translation studies, Valiasr University. 314-334.
Madani, M., (1976). Persian script problems. Kaveh Journal, 62, 5-20. [In Persian]
Matbooee Bonab, M. (2007). The analysis of written errors of English-speakers who learn Persian at the elementary level (Master Dissertation). Shahid Beheshti University, Tehran, Iran. [In Persian]
McEnery, T. & Wilson, A. (2001). Corpus Linguistics: An Introduction. United Kingdom: Edinburgh University Press.
McEnery, T., Xiao, R., & Tono, Y. (2006). Corpus-Based Language Studies. London: Routledge.
Mirdehghan, M. et al., (2014). Written Errors of German-speaking Learners of Persian at Elementary Level: An Orthophonemic Analysis. Journal of Teaching Persian to Speakers of Other Languages, 6(3), 91-116. [In Persian]
Mirzaei Hesariyan, Mb., & Pooladestoon, H., (2020). Grammatical Analysis of Chinese Persian learner's Writings (A2): a Study Based on Category and Scale Grammar. Journal of Teaching Persian to Speakers of Other Languages, 9(20), 115-136. [In Persian]
Modarress Khiyabani, Sh., (2007). The investigation of lexical collocation in Persian (PhD. Dissertation). Allameh Tabatabaei University, Tehran, Iran. [In Persian]
Modarress Khiyabani, Sh., (2017). A pathology of orthography of subtitles at IRINN news network, and IFilm TV network: a Corpus-Based study. Quarterly Scientific Journal of Audio-Visual Media, 27, 32-61. [In Persian]
Mohammed Ebrahimi Jahromi, Z. & Nayeri Fallah, N., (2021). The current errors source of beginner, intermediate and advanced Persian learners. Literary Research, 18 (71). 123-150. [In Persian]
Motavallian Naeini, R., & Dehkhoda, Z. (2018). Spelling Error Analysis of Arab Learners of Persian Language. Language Related Research, 42 (8), 233-264. [In Persian]
Natel Khanlari, P., (1963). Linguistics & Persian language. Tehran: Iranian Culture Foundation. [In Persian]
Parvan, H. & Sarkar Hassankhan, H., (2018). Using Text Corpora in Teaching of German Language. Journal of Foreign Language Research, 2 (8), 449-474. [In Persian]
Pornorouz, M., & Hosseiny, A., (2018). The status of the Persian script and its possible changes, According to mental spaces. Scientific Journal for linguistic and literary criticism studies, 6-7, 105-121. [In Persian]
Sadr Amirjanloo, A., (2002). Shortcomings of Persian script and its consequences in teaching Persian language. The Journal of the Faculty of Literatures and Humanities of Tehran University, 2-3 (49), 393-408. [In Persian]
Safari, S., (2016). Learner corpora: basics, methodology, design and production pattern. The articles collections of the 2nd International Corpus Linguistics Conference. Tehran: Nevise Parsi Publishing, 93-123.  [In Persian]
Sajjadi, SH., & Sahraei, R., (2019). Restrictive and Non-Restrictive Relative Clauses in Persian Language: Sequence of Acquisition by Non-Iranian Persian Learners. Journal of Teaching Persian to Speakers of Other Languages, 8 (18), 119-136. [In Persian]
Shoaei, M., (2012). Some solutions for the challenges facing Persian language script. Quarterly Journals of Persian Language and Literature, 11 (4), 87-103. [In Persian]
Sojoodi, F., (2012). Why should we not change the Persian script. Retrieved from https://www.isna.ir/news/.911114077867 [In Persian]
Sotude, H. & Honarjooyan, Z., (2014). The Study of Persian Writing Style Variations and their Impacts on Information Retrieval: The case of Hamshahri Corpus. Library and Information Science Journal, 66, 31-49. [In Persian]
Tabatabaei, E., (2014). An Analysis of Writing Errors of Arab Learners of Persian: Compiling a Persian Learner Corpus for Intermediate Arab Learners of Persian (Master Dissertation). Ferdowsi University, Mashhad, Iran. [In Persian]
Taheri, Kh., & Gowhari, H., (2021). Information structure and thematic structure of Persian language, a corpus-based analysis. Scientific Journal of Research Language, 13 (38), 213-242. [In Persian]
The Academy of Persian Language and Literature, (2022). Persian Script Instructions. Tehran: Asar publication.
Zakeri, M., (2013). About changing the Persian script. Mah-e-Adabiyat, 193, 54-61. [In Persian]
Zandi, B., (1999). Persian conversation teaching. Tehran: Research and planning office. [In Persian]
Zia, E., (2001). The style of Persian calligraphy. Cheesta Magazine, 190, 794-797. [In Persian]