Imagine 1,100 men and women in a cavernous convention center, laboring 8 to 5, seven days straight with no days off. Does that sound like a penal labor camp, or an enterprise beloved by teachers, who return year after year?
Scoring Advanced Placement tests is the yearly activity that only a participant could love — and love it we do. Don’t try to explain why to others, because they will simply not understand.
Part of the appeal is our entry into a zone where there are no days of the week — just days of the reading. You arrive not on a Friday, but just before Day One. Is it Saturday? Sunday? No one keeps track, except to announce, “Day Four! Halfway there!”
We score packs of 25 pink booklets in yellow folders and know we’ve read 50 or 250 (we’re not quite sure). As leader of one of the questions, I see statistics that fill the gaps in our collective memories. Day Four of this year’s AP Literature reading, for instance, saw 1,077 readers completing 187,676 essays on that day alone. By the end of Day Seven, we had polished off nearly 1,035,000 essays written by 345,000 students.
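For anyone who wants to check those figures, here is a minimal sketch of the arithmetic. It assumes each student writes one essay for each of the exam's three questions, and that Day Four's essays were spread evenly across readers; the variable names are only illustrative.

```python
# Figures quoted in the column.
students = 345_000
day_four_essays = 187_676
day_four_readers = 1_077
pack_size = 25

# Assumption: one essay per student for each of the three questions.
questions = 3
total_essays = students * questions

# Assumption: Day Four's essays were shared evenly among the readers.
essays_per_reader = day_four_essays / day_four_readers
packs_per_reader = essays_per_reader / pack_size

print(f"Total essays: {total_essays:,}")
print(f"Essays per reader on Day Four: {essays_per_reader:.0f}")
print(f"Packs per reader on Day Four: {packs_per_reader:.1f}")
```

Under those assumptions, the total matches the 1,035,000 essays cited, and a typical reader worked through roughly seven packs on Day Four.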
With so many readers, and over a million essays, how can the College Board guarantee that each one receives the score it deserves? The answer is all in the samples. Five days before the reading starts, several of us choose essay samples of each score point. We compose accompanying explanations, based on the scoring guide, and spend a day with the 42 table leaders on each of the three questions. Those table leaders, in turn, spend Day One training their seven or eight readers to apply both the scoring guide and sample essays.
The scoring is consistent year to year, table to table, reader to reader. Forty-five percent of the overall score is based on a difficult multiple-choice exam, further stabilizing the 1-to-5 AP score. Table leaders also see statistics on multiple-choice/free-response (essay) correlations, which alert them to the rare inconsistent reader.
What is the multiple-choice/free-response (MC/FR) correlation? It measures how closely scores on the multiple-choice part of the test track the essay scores. If the correlation were 1:1, there would be no need for essays, and AP scores would be based on multiple-choice questions alone. A very low correlation would make no sense either; few students do very well on one part and poorly on the other.
Generally, table leaders are happy if the MC/FR correlation is 50 percent: half the time the student scores better on MC, and the other half better on FR. It’s complicated, but the calculations give credibility to the enterprise. No college would give credit for scores deemed unreliable.
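To make the idea of a correlation concrete, here is a rough sketch that assumes the statistic is the ordinary Pearson coefficient; the column does not say which formula the College Board actually uses. The function name and the scores are invented for illustration. Note that a Pearson coefficient measures how closely two sets of scores move together, which is not the same thing as the share of students who do better on one section.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores, invented for illustration only.
# Case 1: essay scores rise in lockstep with multiple-choice scores.
lockstep = pearson([30, 35, 40, 45, 50], [4, 5, 6, 7, 8])

# Case 2: the two sections mostly agree, with some reshuffling.
mixed = pearson([1, 2, 3, 4], [2, 1, 4, 3])

print(f"lockstep: {lockstep:.2f}")
print(f"mixed: {mixed:.2f}")
```

In the lockstep case the essays add no information beyond the multiple-choice section, which is the "1:1" situation described above; the mixed case lands in between, where the two sections agree in general but each still tells the scorer something the other does not.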
The AP test remains the only standardized test I’ve ever trusted, both for the intelligent way it’s composed and for the way it’s scored. Somewhere along the line, magic occurs too, as hundreds of AP readers spend a week in minimal comfort to give student essays the fairest reading possible. Their conscientiousness and care guarantee the program’s longevity, so we can return next year and feel the magic once again.
Erica Jacobs, whose column appears Wednesday, teaches at George Mason University.