Readers may recall the controversy over the difficulty of the June 2003 New York State Regents Math A exam. The results of the exam were tossed for juniors and seniors, and a panel was appointed to study what went wrong. For reference, here are links to Commissioner Mills's earlier press release and the charge to the Math A panel. Also for reference, my critique of the New York State Regents Math A exam.
The Math A Panel has now produced an interim report, and it is receiving plenty of press attention. (Go to Google News and do a search on 'regents "math a"'.) The best summary that I've seen is that of Karen Arenson in the New York Times.
The panel's interim report deals with only a very limited part of the charge, and deals with it in a disappointingly limited way. The panel clearly thought it important to have a recommendation about a rescaling of the test out before the start of the school year. I am surprised that they found time only to compare the June 2003 and June 2002 instances; in the 6 weeks that they've worked they really might have had a serious look at, say, the past 6 instances of the exam, in both a qualitative and a psychometric way. Who knows, maybe the June 2002 exam was exceptionally easy.
In fact, though, the conclusions of the panel regarding the difficulty of the June 2003 instance match all the informed speculation that I've seen, including my own: Parts 1 and 2 of the exam were in line with previous instances, and Parts 3 and 4 were more difficult. For my critique I looked at August 2002, January 2003, and June 2003, and found June 2003 the hardest and January 2003 the easiest.
The interim report does not specifically criticize any officials or any actions, but I draw from it the conclusion that inexcusable errors were made in the development of this June, 2003, instance of the exam. In my earlier commentary I quoted an article by David Hoff in Education Week in which he quoted Deputy Commissioner James Kadamus as saying that the June, 2003, exam had more problem-solving questions than previous exams, because the state is gradually raising its expectations. I wrote then that this is a remarkable statement, because all previous reports indicated that the added difficulty of the June exam was unintended and had taken the Department entirely by surprise.
Now here is Karen Arenson, writing on the basis of the interim report of the Math A panel:
Based on field tests before the actual test was administered, the Education Department expected the average score on the June test to be 46. The expected average for the test given a year earlier was 51, slightly higher, but still below the score needed to pass, which is 65 for students who entered ninth grade in 2001 or later, and 55 for everyone else.
Did Commissioner Mills know that the average scaled score of the June, 2003, exam was expected to be 5 points lower than that of June, 2002? (Arenson is mistaken, of course, to describe 51 as "slightly higher" than 46; the difference is large.) Public indications are that Mills did not know this.
I am still surprised that the error of the added difficulty was made in such a blatant way. I myself had been speculating that a subtle error had been made: the Department might have based its psychometric evaluation of the test's difficulty on a population of students rather different from the population that really matters. They might have had a field-test population with lots of bright 9th and 10th graders, and perhaps for that group the difficulty of the June 2003 exam was in line with earlier instances, while for the struggling seniors the added "problem solving" (i.e., aptitude-oriented) focus of the exam would have posed more severe problems. But apparently the Department did not make a subtle error; they were just completely wrong and out of control.
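To make that hypothetical concrete, here is a minimal sketch of the kind of masking effect I had in mind. The numbers, group labels, and mixture weights are invented for illustration only; they are not actual Regents or field-test data.

    # Hypothetical illustration (invented numbers, not actual Regents data):
    # a field-test sample dominated by stronger students can mask added
    # difficulty that falls mainly on a weaker subgroup.

    def expected_average(group_means, group_weights):
        """Weighted average score across student groups."""
        total = sum(group_weights.values())
        return sum(group_means[g] * group_weights[g] for g in group_means) / total

    # Assumed mean scores by group on two hypothetical exam forms; the "new"
    # form is taken to be much harder for struggling seniors but only
    # marginally harder for strong 9th/10th graders.
    means_old = {"strong 9th/10th graders": 60, "struggling seniors": 45}
    means_new = {"strong 9th/10th graders": 59, "struggling seniors": 35}

    # Field-test sample assumed to be dominated by the stronger group,
    # while the population that matters for graduation leans the other way.
    field_test_mix = {"strong 9th/10th graders": 0.9, "struggling seniors": 0.1}
    operational_mix = {"strong 9th/10th graders": 0.4, "struggling seniors": 0.6}

    for label, mix in [("field test", field_test_mix), ("operational", operational_mix)]:
        old = expected_average(means_old, mix)
        new = expected_average(means_new, mix)
        print(f"{label}: old form {old:.1f}, new form {new:.1f}, drop {old - new:.1f}")

With these invented numbers the field-test mix shows a drop of only about 1.9 points between the two forms, while the operational mix shows a drop of about 6.4 points for the very same forms. That is the "subtle error" I had been imagining; the interim report suggests the actual error was nothing so subtle.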