Investigating the Reliability of CET- Reading Test Using Winsteps Rasch Model
Author: Sara17 • February 7, 2018 • 2,587 Words (11 Pages) • 657 Views
...
Methodology
In a country like China, with its rich historical heritage, assessment has long been accepted as a fair indicator of students’ academic success (Cheng, 2008) and is likely to influence their career choices. It is therefore essential to evaluate and improve the usefulness of a high-stakes test such as the CET. The CET’s test items are designed to serve the dual function of criterion-related and norm-referenced testing, which has been challenging for test setters and test takers alike and has made the test open to contention among academics. Hence, this study uses Winsteps, which applies the Rasch model, to examine the reliability of the test and the components that may affect it. The sample consists of 74 participants and the 20 items of the reading test, which gives an overview of the test. In the Rasch model, the total score summarises a person’s expected performance on a variable, and the comparison between two people is independent of which items, within the set of items assessing the same variable, are used. Hence, the Rasch model can be employed as a criterion, namely that the data meet the Rasch requirement of invariance of comparisons, instead of just as a statistical description of the responses, when assessing construct validity, which is closely linked to reliability (Bachman & Palmer, 1996). Moreover, a Rasch analysis can provide insight into whether the scores are justified by the data, through a process of fitting the data to the model. If there is a low level of invariance of responses across diverse groups of people, it can be concluded that the test is unreliable and that the scores cannot be justified as a fair representation of the test-takers’ proficiency. Bryce (1981) cautions that the data will never fit the model perfectly and emphasises that the fit of the data to the model must be weighed with respect to the uses to be made of the total scores.
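The invariance requirement described above can be illustrated with the dichotomous Rasch model, in which the probability of a correct response is a logistic function of the difference between person ability and item difficulty. The following is a minimal sketch, not Winsteps' own code; the function names and the ability and difficulty values are assumptions chosen for illustration, not CET estimates.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale (illustrative values only).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def log_odds(p):
    """Natural log of the odds of success."""
    return math.log(p / (1.0 - p))

def person_comparison(ability_a, ability_b, difficulty):
    """Log-odds difference between two persons answering the same item."""
    return (log_odds(rasch_probability(ability_a, difficulty))
            - log_odds(rasch_probability(ability_b, difficulty)))

# Hypothetical persons (1.5 and 0.5 logits) compared on an easy and a hard item.
easy_item = person_comparison(1.5, 0.5, difficulty=-1.0)
hard_item = person_comparison(1.5, 0.5, difficulty=2.0)
print(easy_item, hard_item)
```

Under the model, both comparisons give the same log-odds difference (the difference in the two abilities), whichever item is used. Data that depart strongly from this pattern are what a Rasch fit analysis is meant to flag.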
Results
[pic 2]
Table 1
From Table 1, the separation index is 1.30 for persons and 2.51 for items. The low item separation of 2.51 implies that the items can be separated into only 2 to 3 levels of difficulty, and the test-takers’ ability cannot really be discriminated into finer strata; hence the test probably has only one standard, a pass or a fail, much like a driving test. This is in accordance with the CET’s purpose, which is to separate the undergraduates who meet the level of English language proficiency needed for their further studies from those who do not. It is therefore also a pass-or-fail test, which explains the low separation indices.
The reliability index is 0.63 for persons and 0.86 for items. Bachman & Palmer (1996) explain that the reliability index indicates how dependably or consistently a test measures a variable. In other words, if the test is retaken, will the scores be similar?
The low CET person reliability index of 0.63 means that the CET discriminates the sample into only one or two levels, which is in accordance with the low person separation of 1.30. The low reliability implies that the person sample might not be large enough to confirm the construct validity of the assessment.
Hence, one way to increase the reliability coefficient is to widen the ability range by increasing the sample. Alternatively, improvements could be made to the sample-item targeting.
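The agreement between the reported reliability and separation values can be checked with the standard relation between the two indices, R = G² / (1 + G²), where G is the separation index and R the reliability. The sketch below applies it to the figures reported in Table 1; the function names are illustrative and are not part of Winsteps.

```python
import math

def reliability_from_separation(separation):
    """Reliability R implied by a separation index G: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)

def separation_from_reliability(reliability):
    """Separation G implied by a reliability R: G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))

# Separation indices reported in Table 1.
person_reliability = reliability_from_separation(1.30)
item_reliability = reliability_from_separation(2.51)
print(round(person_reliability, 2), round(item_reliability, 2))
```

Rounded to two decimals, the results match the reported reliabilities of 0.63 for persons and 0.86 for items, so the two indices express the same information on different scales.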
[pic 3]
Figure 1
Figure 1 shows the matching of the difficulty of the items against the ability of the test-takers. There are no items targeting the high-ability range between 3 and 4 on the measure scale, nor the middle band between 0 and 1. Hence, by providing a wider range of item-person matching, the separation index should increase, meaning that the test will be able to discriminate more levels of difficulty and proficiency, which will lead to a higher reliability coefficient.
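One way to see why targeting matters: under the dichotomous Rasch model, an item contributes information p(1 - p) to a person's measure, which is largest when the item difficulty is close to the person's ability, and the model standard error of the person measure is the reciprocal square root of the summed information. The sketch below is an illustration under assumptions only; the difficulty values are hypothetical and are not the CET item estimates, and measures are assumed to be in logits.

```python
import math

def item_information(ability, difficulty):
    """Information p * (1 - p) that one dichotomous Rasch item gives at an ability."""
    p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    return p * (1.0 - p)

def person_standard_error(ability, difficulties):
    """Model standard error of a person measure: 1 / sqrt(total information)."""
    total = sum(item_information(ability, d) for d in difficulties)
    return 1.0 / math.sqrt(total)

# Hypothetical six-item sets for a person at 3.5 logits.
off_target = [-2.0, -1.5, -1.0, -0.5, 1.5, 2.0]  # no items near 3.5
on_target = [2.5, 3.0, 3.25, 3.75, 4.0, 4.5]     # items around 3.5

se_off = person_standard_error(3.5, off_target)
se_on = person_standard_error(3.5, on_target)
print(se_off, se_on)
```

With the same number of items, the on-target set gives a smaller standard error for the high-ability person. Because separation compares the spread of measures with their measurement error, smaller errors are what would allow the separation and reliability indices to rise.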
--------------------------------------------------------------------------------------------
|ENTRY   TOTAL  TOTAL           MODEL|   INFIT  |  OUTFIT  |PTMEASUR-AL|EXACT MATCH|       |
|NUMBER  SCORE  COUNT  MEASURE  S.E. |MNSQ  ZSTD|MNSQ  ZSTD|CORR.  EXP.| OBS%  EXP%| ITEM  |
|------------------------------------+----------+----------+-----------+-----------+-------|
|    12     20     74     2.00    .29|1.48   2.8|1.92   3.6|A-.08   .41| 66.2  77.5| item42|
|    15
...