Data Description of Writing Recount Text Skill
Figure 4.8 Scatter Plot of Writing Skill between the Two Raters
Furthermore, the interpretation of linearity between the two raters suggested by the scatter plot in Figure 4.8 above is corroborated by a
numerical method, i.e., ANOVA of the data from the two raters. The detailed ANOVA result for the writing skill scores given by the two raters is provided in
Table 4.14 below:
Table 4.14 ANOVA(b) of Writing Skill Data between the Two Raters
Model          Sum of Squares   df   Mean Square   F        Sig.
1 Regression   2668.597          1   2668.597      27.983   .000(a)
  Residual     2288.738         24     95.364
  Total        4957.335         25
a. Predictors: (Constant), Writing_Rater_2
b. Dependent Variable: Writing_Rater_1
Table 4.14 above shows that the F-test value obtained is 27.983 with a level of significance, or p-value, reported as 0.000. Because the p-value is lower than the 0.010 threshold used at the 99% level of
confidence (p = 0.000 < 0.010), the regression model between the two raters is considered linear.
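The F value in Table 4.14 follows from the sums of squares and degrees of freedom in the same table. The sketch below recomputes it as a consistency check; the variable names are illustrative and are not taken from the SPSS output.

```python
# Values copied from Table 4.14; variable names are illustrative only.
ss_regression = 2668.597
ss_residual = 2288.738
df_regression = 1
df_residual = 24

# A mean square is a sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The F statistic is the ratio of the two mean squares.
f_value = ms_regression / ms_residual

# The total sum of squares is the sum of the two components.
ss_total = ss_regression + ss_residual

print(f"Mean square (residual): {ms_residual:.3f}")
print(f"F: {f_value:.3f}")
print(f"Total sum of squares: {ss_total:.3f}")
```

The recomputed values can be compared directly with the Mean Square, F, and Total entries of the table.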
b. Test of Normality Distribution
The normality distribution of the writing recount text skill data of the two
raters is investigated through a graphical method and a numerical method.
1 Graphical Method
In terms of the graphical method, the Q-Q plot is employed to examine the normality distribution of the writing recount text skill data of the two raters.
Figure 4.9 and Figure 4.10 below present the Q-Q plots of the two raters:
Figure 4.9 Detrended Normal Q-Q Plot of Writing Skill of Rater 1
Figure 4.10 Detrended Normal Q-Q Plot of Writing Skill of Rater 2
Figure 4.9 indicates that two subjects, i.e., 7 and 8, are extreme cases because they are located more than three standard
deviations from the mean. Nonetheless, inspection of their results from Rater 2 (Figure 4.10) shows that subjects 7 and 8 remain within the
accepted standard deviation from the mean, as they do in the CT tests. Because they may be regarded as consistently doing well in the tests, they cannot be considered
outliers whose deletion from the analysis would be justified. In this case, an idiosyncratic tendency of Rater 1 to give higher scores to these two subjects may
be regarded as the cause.
2 Numerical Method
The numerical method is employed to confirm the interpretation of
normality distribution obtained through the graphical method above. In this case, the result of the numerical method using the Shapiro-Wilk test is
presented in Table 4.15 below:
Table 4.15 Shapiro-Wilk Test of Writing Skill from the Two Raters
                  Shapiro-Wilk
                  Statistic   df   Sig.
Writing_Rater_1   .833        26   .001
Writing_Rater_2   .878        26   .005
Based on Table 4.15 above, the test shows that the significance values of the writing data sets of the first and second raters are
lower than the 0.010 threshold used at the 99% level of confidence (p1 = 0.001 < 0.010 and p2 = 0.005 < 0.010). In other words, the two data sets are not considered normally
distributed.
By taking account of the results of the linearity test and the normality
distribution test of the writing skill data, a non-parametric test, i.e., Spearman’s rho, is preferred for investigating the inter-rater reliability. The reason is that,
although the data are considered linear based on both the graphical method and the
numerical method, the normality tests are not consistent: based on the skewness and kurtosis results, the data
are considered normally distributed, but when they are investigated through the graphical method and the numerical method using the Shapiro-Wilk test, the data
are found not to be normally distributed. Table 4.16 below provides the result of the inter-rater reliability between the two raters examined through Spearman’s
rho:
Table 4.16 Spearman’s rho of Inter-rater Reliability between the Two Raters
Spearman’s rho                              Writing_Rater_1   Writing_Rater_2
Writing_Rater_1   Correlation Coefficient   1.000             .741
                  Sig. (2-tailed)           .                 .000
                  N                         26                26
Writing_Rater_2   Correlation Coefficient   .741              1.000
                  Sig. (2-tailed)           .000              .
                  N                         26                26
Note: Correlation is significant at the 0.01 level (2-tailed).
Based on Table 4.16 above, the Spearman’s rho (ρ) obtained for the inter-rater reliability is 0.741, which is significant at the 99% level of
confidence (p < 0.01) and is considered to indicate a high relationship (see Table 3.4 in Chapter III for the interpretation of correlation coefficients). Hence,
the writing skill scores given by the two raters are considered interchangeable.
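As an illustration of how Spearman’s rho is obtained, the following minimal sketch ranks each rater’s scores (tied scores share the mean of their ranks) and then computes the Pearson correlation of the two rank lists. The scores in the example are hypothetical and are not the study data; the function names are likewise illustrative, and this is not the SPSS procedure used in the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranked scores."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for six students (not the study data).
rater_1 = [40, 55, 50, 70, 65, 45]
rater_2 = [45, 50, 50, 75, 60, 40]
print(round(spearman_rho(rater_1, rater_2), 3))
```

Because only the ranks enter the calculation, the coefficient does not rely on the normality assumption that the two raters’ data sets failed to meet.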
c. The Final Score of Writing Recount Text Skill
The final score of writing recount text skill is obtained by calculating
the average of the scores given by the two raters. The final score of the writing recount text skill is summarized in the descriptive statistics provided in Table 4.17 as
follows:
Table 4.17 Descriptive Statistics of Final Score of Writing Recount Text Skill
N Valid                   26
N Missing
Mean                      52.88
Std. Error of Mean        2.653
Median                    50.00
Mode                      50
Std. Deviation            13.528
Variance                  183.012
Skewness                  .663
Std. Error of Skewness    .456
Kurtosis                  .226
Std. Error of Kurtosis    .887
Range                     50
Minimum                   33
Maximum                   83
Sum                       1375
Percentiles 25            41.67
Percentiles 50            50.00
Percentiles 75            60.42
Based on Table 4.17 above, the central tendency of the final score of writing recount text skill data of the 26 eleventh grade students of
MA Khazanah Kebajikan in the academic year 2015/2016 is indicated by the median, mode, and mean. First, the median obtained is 50.00. Meanwhile, with the mean
of 52.88 and the mode of 50, most of the students’ writing recount text skill is considered to be below the average score.
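The mean and its standard error in Table 4.17 can be recomputed from the sum, the number of students, and the standard deviation reported in the same table. The sketch below is only a consistency check on the reported figures; the variable names are illustrative.

```python
import math

# Values copied from Table 4.17; variable names are illustrative only.
n_students = 26
score_sum = 1375
std_deviation = 13.528

# The mean is the sum of the final scores divided by the number of students.
mean_score = score_sum / n_students

# The standard error of the mean is the standard deviation divided by sqrt(N).
std_error_mean = std_deviation / math.sqrt(n_students)

print(f"Mean: {mean_score:.2f}")
print(f"Std. error of mean: {std_error_mean:.3f}")
```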
Moreover, the dispersion of the final score of writing recount text skill data is shown by the range, standard deviation, skewness, and kurtosis. In
this case, the range between the maximum of 83 and the minimum of 33 is 50. With a standard deviation of 13.528, the skewness and kurtosis obtained are 0.663 and
0.226 respectively. The skewness and kurtosis are converted to ratios by dividing each statistic by its standard error, i.e.,
1.453 and 0.255 respectively. Based on the skewness ratio of 1.453 and the kurtosis ratio of 0.255, the writing recount text skill data are considered normally
distributed because both ratios fall within the accepted range, i.e., between -2 and 2.
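The two ratios can be reproduced by dividing each statistic in Table 4.17 by its standard error. The sketch below assumes that definition and uses the rounded table values, so the skewness ratio comes out at about 1.454 rather than the reported 1.453; the small difference is presumably due to rounding of the inputs. The function and variable names are illustrative.

```python
# Values copied from Table 4.17; names are illustrative only.
skewness, se_skewness = 0.663, 0.456
kurtosis, se_kurtosis = 0.226, 0.887


def within_accepted_range(ratio, limit=2.0):
    """Return True when the ratio lies between -limit and +limit."""
    return -limit <= ratio <= limit


# Each ratio is the statistic divided by its standard error.
skewness_ratio = skewness / se_skewness
kurtosis_ratio = kurtosis / se_kurtosis

print(f"Skewness ratio: {skewness_ratio:.3f}")
print(f"Kurtosis ratio: {kurtosis_ratio:.3f}")
print("Within -2 to 2:",
      within_accepted_range(skewness_ratio) and within_accepted_range(kurtosis_ratio))
```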
To further verify the normality distribution of the final score of writing recount text skill data, a graphical method and a numerical method are
employed.
1 Graphical Method
The Q-Q plot employed to examine the normality distribution of the final score of writing recount text skill data is presented in Figure 4.11 as follows:
Figure 4.11 Detrended Normal Q-Q Plot of Final Score of Writing Skill
Based on the Q-Q plot of the final score of writing skill in Figure 4.11 above, the data can be considered not normally distributed because two
subjects, i.e., 7 and 8, are found to be outside the accepted range, i.e., three standard deviations from the mean. This may occur because the Q-Q plot of one of the raters’ data
also shows subjects 7 and 8 located more than three standard deviations from the mean (see Figure 4.9).
2 Numerical Method
The numerical method using the Shapiro-Wilk test presented in Table 4.18
below yields a result similar to that of the skewness and kurtosis ratios, which indicate that the final score of writing recount text skill data are normally distributed.
The Shapiro-Wilk test reports that the significance value of the final score of writing recount text skill data is higher than the 0.010 threshold used at the 99% level of confidence
(p = 0.049 > 0.010), so the data can be considered to have a normal distribution.
Table 4.18 Shapiro-Wilk Test of Final Score of Writing Recount Text Skill Data
                      Shapiro-Wilk
                      Statistic   df   Sig.
Final_Score_Writing   .922        26   .050
Although the graphical method in Figure 4.11 shows that the final score of writing recount text skill data are not normally distributed, the
numerical methods, comprising the skewness and kurtosis ratios and the Shapiro-Wilk test, indicate that the data are normally distributed; therefore, the
final score of writing recount text skill data can be concluded to have a normal distribution.