Article type: Research article
Authors
1 Corresponding author, Assistant Professor, Department of Teaching Persian to Non-Persian Speakers, Imam Khomeini International University, Qazvin, Iran.
2 Assistant Professor, Department of Teaching Persian to Non-Persian Speakers, Imam Khomeini International University, Qazvin, Iran.
3 Associate Professor, Department of Teaching Persian to Non-Persian Speakers, Imam Khomeini International University, Qazvin, Iran.
Abstract
Keywords
Article title [English]
Authors [English]
One of the basic and necessary steps in teaching Persian to non-Persian speakers is collecting and recording linguistic data, preparing a corpus of non-Iranian learners of Persian, and describing its data using linguistic theories. The aim of this study was to take a step towards creating such a learner corpus. The linguistic data were taken from the final writing test of Chinese learners of Persian at the Persian language teaching center of Imam Khomeini International University (90 Chinese students at level A2 and 36 at level B2), and the syntactic labeling, based on Category and Scale grammar, was done manually. Nine grammatical labels were recorded in the students' writing: sentence, clause, rank clause, rank-shifted clause, finite clause, non-finite clause, verbal group, nominal group (complement and predicate), and adverbial group. The corpus consists of a total of 126 written texts comprising 212 paragraphs, 29,857 words, 3,175 sentences, 4,912 clauses, and 19,369 groups. These groups are 4,912 verbal, 8,760 nominal, and 4,912 adverbial groups (including adjective and prepositional groups). The research also confirms the effectiveness of Bateni's descriptive grammar, which is based on Category and Scale grammar, in the syntactic labeling of the writings of Chinese learners of Persian.
Extended Abstract:
Teaching Persian to Foreigners (TPF) is still in its early stages; therefore, one of the basic and necessary steps is to collect and record raw linguistic data, prepare a Corpus of Non-Iranian Learners of Persian (CNLP), and describe its data using linguistic theories. The present study is the result of an internal university research project carried out with the support of Imam Khomeini International University (IKIU), Qazvin. Its aims were to take a step towards creating a CNLP, to identify and resolve potential problems, and to meet some of the needs of researchers.
The research is based on Bateni's book describing the grammatical structure of the Persian language, which follows the theory of Category and Scale Grammar (CSG). CSG discusses four categories: "unit", "structure", "class" and "system". "Unit" and "structure" belong to the syntagmatic axis, which represents the sequence of the constituents or elements of language over time, while "class" and "system" belong to the paradigmatic axis, which represents the range of possibilities available to the speaker at each point in the speech chain.
The corpus of the research is taken from one of the final writing tests of the general and supplementary Persian language courses of the Persian Language Teaching Center (PLTC) at IKIU. Ninety Chinese learners of Persian at the general level and 36 at the supplementary level took this test; hence, a total of 126 test sheets were used as raw data.
To prepare the corpus, the writing sheets of Chinese Learners of Persian (CLPs) were first typed in Microsoft Word. The typed version reproduces what the CLPs had written in their compositions as faithfully as possible. Then, the grammatical tagging of the typed content was done within the framework of CSG. At this stage, nine grammatical tags were recorded in the writings of the CLPs: sentence, clause, rank clause, rank-shifted clause, finite clause, non-finite clause, verbal group, nominal group, and adverbial group.
Since full stops are taken as the boundary between the end of one sentence and the beginning of the next, the researchers reviewed the punctuation and, where necessary, corrected or supplemented it. Next, the sentences were identified and separated. In analyzing the corpus, the components of rank-shifted clauses are identified and counted as constituent elements of the clause (verbal group, nominal group, and adverbial group). In the analysis of nominal groups that contain other nominal dependents, only the main nominal group is considered. Adverbial groups are also identified as a unit; this means that nominal groups are not labelled separately within adverbial groups. Also, because Persian allows the subject pronoun to be dropped, the subject is not expressed as a nominal group in a significant number of sentences in the CLPs' writings. In labelling the corpus, an attempt was made to analyze the learners' texts as they were written and to label them with as few linguistic corrections as possible.
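The labelling conventions described above can be sketched as a small data model. This is only an illustrative sketch, not the procedure used in the study, where tagging was done manually. The names `Tag`, `Node`, and `count_tags` are hypothetical, the transliterated sentence is an invented example rather than a sentence from the corpus, and the choice to give a clause several tags at once (clause, rank clause, finite clause) is an assumption about how the nine labels combine.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Tag(Enum):
    """The nine grammatical labels named in the abstract."""
    SENTENCE = "sentence"
    CLAUSE = "clause"
    RANK_CLAUSE = "rank clause"
    RANK_SHIFTED_CLAUSE = "rank-shifted clause"
    FINITE_CLAUSE = "finite clause"
    NON_FINITE_CLAUSE = "non-finite clause"
    VERBAL_GROUP = "verbal group"
    NOMINAL_GROUP = "nominal group"
    ADVERBIAL_GROUP = "adverbial group"


@dataclass
class Node:
    """One labelled stretch of text (hypothetical structure)."""
    text: str
    tags: tuple  # one or more Tag values; multi-tagging is an assumption
    children: list = field(default_factory=list)


def count_tags(node, counts=None):
    """Count labels in a tree, treating adverbial groups as single units."""
    if counts is None:
        counts = Counter()
    counts.update(node.tags)
    # Nominal groups inside an adverbial group are not labelled separately,
    # so the walk does not descend below an adverbial group.
    if Tag.ADVERBIAL_GROUP in node.tags:
        return counts
    for child in node.children:
        count_tags(child, counts)
    return counts


# Invented example: "diruz be daneshgah raftam" (I went to the university
# yesterday). The subject pronoun is dropped, so no nominal group is counted.
sentence = Node(
    "diruz be daneshgah raftam",
    (Tag.SENTENCE,),
    [
        Node(
            "diruz be daneshgah raftam",
            (Tag.CLAUSE, Tag.RANK_CLAUSE, Tag.FINITE_CLAUSE),
            [
                Node("diruz", (Tag.ADVERBIAL_GROUP,)),
                Node(
                    "be daneshgah",
                    (Tag.ADVERBIAL_GROUP,),
                    [Node("daneshgah", (Tag.NOMINAL_GROUP,))],
                ),
                Node("raftam", (Tag.VERBAL_GROUP,)),
            ],
        )
    ],
)

counts = count_tags(sentence)
for tag in Tag:
    print(f"{tag.value}: {counts[tag]}")
```

Stopping the walk at an adverbial group mirrors the convention that such groups are labelled as a unit, and the missing nominal group in the example reflects the subject-dropping feature noted above.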
The CLPs at the general level were students who attended the PLTC at IKIU for 16 weeks, 20 hours per week, covering the four skills of listening, reading, speaking and writing. They therefore received a total of 320 hours of face-to-face instruction. Considering the quality and quantity of the educational program and the individual characteristics of the CLPs, the general course can be considered equivalent to the pre-intermediate level (A2) of the Common European Framework of Reference for Languages (CEFRL). The CLPs at the supplementary level were students who attended the PLTC at IKIU for 32 weeks, 20 hours per week, covering the four skills. They therefore received a total of 640 hours of face-to-face instruction. Considering the quality and quantity of the educational program and the individual characteristics of the CLPs, the supplementary course can be considered equivalent to the upper-intermediate level (B2) of the CEFRL.
The most important achievement of the research is the preparation of the initial version of the CNLP, with the following characteristics. A total of 126 writings by CLPs were used as raw data for the CNLP at two levels (90 writings at the general level and 36 at the supplementary level). The corpus is therefore composed of 126 written texts comprising 212 paragraphs and 29,857 words. It contains a total of 3,175 sentences, 4,912 clauses, and 19,369 groups (4,912 verbal groups, 8,760 nominal groups, and 4,912 adverbial groups, the last including adjective and prepositional groups). The study confirms the effectiveness of CSG in accurately describing the CLPs' writings.
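The reported counts lend themselves to a few simple derived ratios. The following is a minimal sketch that uses only the figures stated above; the dictionary names and the helper `ratio` are hypothetical. It also reports how far the listed group subtypes are from the stated group total, without assuming which figure is authoritative.

```python
# Figures as reported in the abstract.
REPORTED = {
    "texts": 126,
    "paragraphs": 212,
    "words": 29857,
    "sentences": 3175,
    "clauses": 4912,
    "groups": 19369,
}
LEVELS = {"general (A2)": 90, "supplementary (B2)": 36}
GROUP_SUBTYPES = {"verbal": 4912, "nominal": 8760, "adverbial": 4912}


def ratio(numerator, denominator):
    """Return a ratio rounded to two decimal places."""
    return round(numerator / denominator, 2)


summary = {
    "words_per_text": ratio(REPORTED["words"], REPORTED["texts"]),
    "words_per_sentence": ratio(REPORTED["words"], REPORTED["sentences"]),
    "sentences_per_text": ratio(REPORTED["sentences"], REPORTED["texts"]),
    "clauses_per_sentence": ratio(REPORTED["clauses"], REPORTED["sentences"]),
    "groups_per_clause": ratio(REPORTED["groups"], REPORTED["clauses"]),
}

# Consistency checks on the reported totals.
texts_from_levels = sum(LEVELS.values())
subtype_sum = sum(GROUP_SUBTYPES.values())
group_gap = REPORTED["groups"] - subtype_sum

for name, value in summary.items():
    print(f"{name}: {value}")
print(f"texts from levels: {texts_from_levels}")
print(f"sum of group subtypes: {subtype_sum} (gap to stated total: {group_gap})")
```

Ratios such as clauses per sentence give a rough measure of syntactic complexity, but they describe the corpus as a whole; per-level figures would require the A2 and B2 counts separately, which are not given here.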
Keywords [English]