Practicing Analytic and Holistic Scoring Rubrics: An Investigation of a Cambodian Academic Writing Class
*Mouy Eng, Sofilta Seth, Darisna Sok, Panhchaleak Sokheng, and Bophan Khan
Within the field of classroom language testing, the assessment of writing proficiency is performance-based (Swartz et al., 1999) and can pose numerous challenges for classroom test designers, i.e., teachers. A writer, i.e., a student, has to generate relevant information and ideas, encode his or her message linguistically, and take into account the characteristics of the audience. According to Bachman and Palmer (1996), writing is an authentic task for testing students’ writing ability because performance on paper most directly reflects the real knowledge and abilities writers have acquired.
However, a test of writing ability can be invalidated by several interfering factors. According to Cooper (1984), the first factor is the topic the students write about, especially when it is prescribed. Another factor that can affect students’ performance on a writing test is the genre the students are required to produce. Students may be more skillful in the genres not tested and less experienced in those tested, so it is unfair to evaluate their overall writing abilities from performance on such a test. Other factors include time limitation, students’ health and mood, the classroom environment, and test anxiety. On top of these student-related issues, rater inconsistency is a common source of invalid assessment of writing. Raters or, henceforth, scorers may adopt different scoring procedures and approach students’ written work differently. Holistic scorers look at the whole picture of a piece of writing, while analytic scorers look at the small elements that together make up a piece of writing. In the Cambodian ELT context, the dilemma of choosing between scoring procedures is widespread.
In light of this issue, the current study investigates Cambodian teachers’ perspectives on, and preferences between, the two kinds of scoring procedures, a choice that has long been highly contested among classroom test and assessment designers.
According to Klein et al. (1998), one influential factor in scoring reliability is the scoring technique. The holistic and analytic approaches are the most common techniques for marking writing tasks. Mertler (2001) defines holistic scoring as the process of scoring the whole paper at once, while analytic scoring assesses a written response against provided criteria and then sums the marks for the different writing components to obtain a total score. Mueller (2014), similarly, explains that analytic scoring specifies students’ writing ability with respect to each criterion. In contrast, a holistic rubric does not list separate levels of performance for each criterion; instead, holistic rubrics assess students’ writing generally by looking at the overall content. Mueller observes that teachers appear to have a slightly stronger preference for analytic scoring because of its reliability and because students can get specific feedback based on the criteria used.
Lumley and McNamara (1995) and McNamara (1996) state that holistic scoring cannot provide teachers with detailed information about students’ writing because it does not focus on areas such as grammar, organization, and other aspects of writing. In addition, it produces a single score, which makes it less reliable than analytic scoring, and a single score can be difficult for both teachers and students to interpret. Even though the teacher is supposed to assess a range of features in holistic scoring, such as style, content, organization, grammar, punctuation, and spelling, this is not easy to do. With analytic scoring, the writing teacher provides learners with much more specific feedback, and students know better which points they have to improve. Moreover, since analytic scoring is criteria-oriented, it is easier to train analytic scorers than holistic ones. Analytic scoring, however, has its own drawbacks, one of which is scorers’ biased judgment of students’ performance in one area based on their performance in another.
At the other end of the spectrum, proponents of holistic scoring such as Elbow and Yancey (1994) argue that scoring holistically can produce reliable results if the raters understand the real nature of holistic scoring. Andrea (2012) notes that holistic rubrics can be applied by multiple scorers to increase reliability, since scoring holistically consumes less time than analytic scoring. With proper training, holistic scorers can increase rater reliability by reducing the influence of their background, experience, mood, and other personal factors on their scoring.
Altogether, scoring methods are commonly said to affect the reliability of performance test scores. Scoring holistically, a reader makes a single, overall judgment about an answer’s quality. Scoring analytically, the reader assigns scores to reflect how well the student responds to each of the several aspects of the question or task that should be addressed in a sample. The analytic method requires more training and scoring time than the holistic method.
Both scoring procedures appear to have their own advantages and serve different purposes. Different scoring practices could influence students’ writing test performances in different ways. What techniques are commonly employed by Cambodian EFL teachers in scoring students’ writing? What are teachers’ views about analytic scoring? What are teachers’ views about holistic scoring? These are the questions the current study aims to answer.
The study was conducted to explore teachers’ different perceptions of scoring writing and to have them share their experiences in scoring writing. To obtain insightful data, semi-structured interviews were conducted with three EFL teachers teaching at three different universities. The interviews took place at places and times convenient to the participants. Each interview was led by two of the researchers: one asked the interviewee questions while the other took notes and monitored the interview duration. The interviews were transcribed verbatim and analyzed for emerging concepts and themes to explore the participants’ perspectives and suggestions on effective procedures for scoring writing.
Techniques commonly employed by the participants in scoring writing
The participants did not depend solely on one single method, for several reasons. Time, the number of classes they had to teach, the objectives of the assessment, and school requirements were the main factors influencing their choices of scoring methods. One of the participants interestingly mentioned that if the students could interpret the topic in the right way, he would award 50% of the score holistically and derive the other 50% from analytic scoring. Another interviewee reported that his choice would depend on the purpose of the writing test: if specific elements of writing were in focus, he would employ analytic scoring because it would enhance students’ learning and writing against certain criteria and provide clear directions for improvement. He also added that analytic scoring is more critical than holistic scoring. The third participant believed that if he had only one or two classes, he would use the analytic approach because it would provide clear feedback on students’ strengths and weaknesses in writing. Like the second participant, the third interviewee argued that students could improve their own writing by taking all of that feedback into consideration and making changes in their future writing. On top of that, the participants all agreed that time would be the determiner of their scoring methods and expressed a stronger preference for analytic scoring if time were abundant.
The participants’ views about analytic scoring
Analytic scoring is objective and critical. To correct students’ writing analytically, specific criteria must be established so that the teacher keeps in mind the exact elements to pay close attention to; for instance, 20% each might be allocated to grammar, spelling, word use, coherence, and cohesion. One participant thought that students can identify their weak and strong areas that way; hence, when scored with this method, students get the direct feedback they need to improve their writing. Nevertheless, when it comes to real marking, there are a few disadvantages to analytic scoring, too. It can become overly exacting when it focuses on many small, unnecessary components of students’ writing. For another participant, correcting specific elements can be discouraging for students because they might make a lot of vocabulary or grammatical mistakes; they may conclude that they are not good at writing, so there is a chance that they will give up. Additionally, the analytic method is time-consuming, so busy teachers would rarely use it to mark their students’ papers. It requires a lot of time and, crucially, most students would fail if analytic scoring were used.
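The arithmetic behind the analytic procedure the participants describe can be made concrete with a small sketch. The criterion names, the equal 20% weights, and the 0-5 band scale below are illustrative assumptions based on the example above, not a rubric the study prescribes.

```python
# A minimal sketch of analytic scoring: each criterion is rated on a
# band scale, weighted, and the weighted parts are summed to a total.
# Criteria and equal 20% weights follow the example in the text;
# the 0-5 band scale and 100-point total are assumptions.

CRITERIA_WEIGHTS = {
    "grammar": 0.20,
    "spelling": 0.20,
    "word_use": 0.20,
    "coherence": 0.20,
    "cohesion": 0.20,
}

def analytic_score(band_scores, max_band=5, total=100):
    """Combine per-criterion band scores (0..max_band) into one mark."""
    return sum(
        CRITERIA_WEIGHTS[criterion] * (band / max_band) * total
        for criterion, band in band_scores.items()
    )

# Example: a script strong on coherence but weak on grammar.
marks = {"grammar": 2, "spelling": 4, "word_use": 3,
         "coherence": 5, "cohesion": 4}
print(analytic_score(marks))  # 72.0
```

A holistic score, by contrast, would collapse the same script into a single judgment with no per-criterion breakdown, which is precisely the feedback difference the participants weigh.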
The participants’ views about holistic scoring
Holistic scoring is subjective: the teacher just looks at students’ interpretation of the topic, their overall performance, or the whole idea. If students can convey their ideas in the writing, regardless of grammatical or vocabulary errors, they get a high score. The advantage for teachers using this technique is that they can save a lot of time; they do not need to spend time scrutinizing every error in the writing, only forming an overall view of it. As a result, many students manage to pass the test, but they do not get much corrective feedback for further improvement. More interestingly, it does not reduce students’ confidence in their writing, encouraging them to write more. Nonetheless, the participants thought it is not fair for everyone. One of them mentioned that students may write differently in terms of format, grammar use, word use, or organization, but with holistic scoring they may end up receiving similar scores regardless of their errors with discourse. In addition, it affects students’ future writing ability, since they do not get much feedback on their papers and would thus find it hard to develop their writing.
In real practice, the participants had no single preference, since they always took both methods into account. School requirements, students’ low proficiency, and lack of time and budget were the reasons for their mixed practice. They would, however, prefer the analytic technique to holistic scoring because, they reported, holistic scoring may not be reliable enough if done without experience or expertise. A good piece of writing comprises not only content but also organization, word use, grammatical structures, and proper style. The participants thought that analytic scoring is more reliable because it takes better care of those elements.
Teacher-student conferencing was suggested by the participants as an aid to help teachers who prefer either holistic or analytic scoring mark their students’ essays more effectively. It is especially helpful for holistic scoring, since teachers otherwise have no chance to comment or give feedback on students’ papers directly. Teacher-student conferencing provides opportunities for teachers to further elicit the mistakes and weaknesses in students’ writing and to collaborate with students in finding ways to revise the written work.
The participants’ actual scoring practice
The primary goal of this study is to understand the participants’ perceptions of scoring writing, which could help learners improve their writing. The findings suggest that analytic scoring outweighs holistic scoring not only in making the scoring of writing efficient but also in improving students’ future writing skills.
The first participant claimed that he frequently used holistic scoring because he had to mark many students’ papers from many classes, despite his personal preference for an analytic rubric, which he felt gives reliable results and lets him give detailed feedback on the papers. In addition, the participant contended that analytic scoring helps students greatly to develop their writing, since it shows them the weak points they have to improve. He added that analytic scoring promotes writing and makes students learn more from their own writing. The third interviewee, however, frequently used analytic rather than holistic scoring even though the method consumed a lot of her time and added to her already overwhelming workload. She believed that it is a helpful strategy for making students learn better in a writing class. This finding is in agreement with Mueller (2014), who states that the analytic rubric is more common because teachers typically want to assess students’ writing separately on different criteria. Moreover, paying attention to specific criteria of students’ performance allows teachers to give better feedback, and students can recognize the areas where they need improvement.
The study findings, overall, suggest that even though analytic scoring was thought to be more useful than holistic scoring, it may not be applicable in every classroom situation in reality. Teachers who teach far too many classes, like participants 1 and 2, cannot spend time marking students’ writing by looking at and commenting on every small writing element; holistic scoring, in contrast, works well under this condition. Therefore, different methods work well in different situations.
Furthermore, students’ preferences for types of teacher feedback determine teachers’ choice of scoring procedures. For example, an analytic rubric can discourage some students and reduce their interest in writing because of the red ink marked all over their writing. Semke (1984) explained that students do not like to see their writing covered with red ink; instead, this kind of practice may result in disappointment and discouragement, as the students fail to see encouragement for their writing efforts.
However, for some other students, comments and feedback are very important, and they would rather see those comments on their papers than not. Diab (2005) suggested that students are concerned not only with the content and whole picture of the writing but also with accuracy, and that they perceive attention to linguistic errors as effective feedback from teachers. More importantly, instructors and students seem to agree that error correction is necessary, since they see it as a “security blanket” and as what students expect to receive for revision to develop their writing skill. In other words, students would not know what to do or correct, or how to start revising, unless they see the red marks on their papers.
One more factor that can influence the selection of scoring methods is the course objective and school curriculum. If the curriculum mainly focuses on content areas, teachers may be better off using the holistic method, because the students may care more about the content and whole meaning of the writing. In contrast, if the course syllabus pays more attention to specific writing components separately, an analytic rubric will probably work better.
The study shows that it is probably wise not to claim that analytic scoring is better than holistic scoring, or vice versa, because each has its own advantages and teachers have different personal preferences. However, no matter which method teachers use, they can add an extra technique to ensure that students obtain optimal benefit from what they have written and from the teachers’ scoring methods. Integration of the two scoring methods plus teacher-student conferencing feedback would be an interesting area for further investigation.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Cooper, P. L. (1984). The assessment of writing ability: A review of research. An ETS research report. Princeton, NJ: Educational Testing Service.
Diab, R. L. (2005). Teachers’ and students’ beliefs about responding to ESL writing: A case study. TESL Canada Journal, 23 (1), 28-43.
Elbow, P., & Yancey, K. B. (1994). On the nature of assessing writing: An inquiry composed on e-mail. Assessing Writing, 91-107.
Klein, S. P., Stecher, B. M., Shavelson, R. J., McCaffrey, D., Ormseth, T., Bell, R. M., Comfort, K., & Othman, A. R. (1998). Analytic versus holistic scoring of science performance tasks. Applied Measurement in Education, 11(2), 121-137.
Lumley, T., & McNamara, T. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54-71.
McNamara, T. (1996). Measuring second language performance. London: Longman.
Mueller, J. (2014). Authentic assessment toolbox. Retrieved September 10, 2014, from http://jfmueller.faculty.noctrl.edu/toolbox/rubrics.htm#howmany
Semke, H. D. (1984). Effect of the red pen. Foreign Language Annals, 17(3), 195-202.
Swartz, C.-W., et al. (1999). Using generalizability theory to estimate the reliability of writing scores derived from holistic and analytical scoring methods. Educational and Psychological Measurement, 59(3), 492-506.
(*Mouy Eng is a Deputy Academic Manager for the Chinese Language Program at Westline Education Group. She received an MA in TESOL from the Royal University of Phnom Penh in 2015. Through her seven years of experience teaching English and Chinese, she has developed a strong interest in researching and improving English and Chinese language education for Cambodian students. Sharing and pursuing professional development are her favorite things in life.)