While this is not the first time the SAT has changed, there is something unprecedented about this particular overhaul: the motive behind it. As more is asked of the SAT, might it do fewer things well? As it is stretched to do more, will something have to give? With the first official data sets from the redesigned PSAT published earlier this month, the promise of the new test appears to remain unmet. The results introduce anomalies that did not exist before, and unusual scoring patterns and inconsistencies in the released data imply errors in methodology, weaknesses in test design, or both.
Once upon a time the SAT was a college admissions exam, and nothing more. That changed when the idea of a redesigned SAT was introduced a few years ago.
Proponents embraced the bold vision of a more relevant exam better aligned with the high school curriculum, while skeptics questioned the new test’s ability to produce predictively valid results. Could the new test be trusted to do what it was previously meant to do?
The popular narrative is that the SAT changed because it was losing out to the ACT. But losing how, exactly? Losing where? Until very recently, the SAT was not losing much market share at the student level but rather at the state-contract level. It was not so much the competitive college applicant market that was favoring the ACT but rather the institutional buyers looking for a test that could serve a range of needs.
College Board’s new president had recently arrived with celebrated ties to the Common Core standards and an understandable interest in the lucrative state testing market associated with them. It was therefore almost inevitable that the organization’s flagship product would be not only redesigned but also reimagined and repurposed. A once high-stakes college admissions differentiator sold to individual students would now transform into a suite of standards-based assessments sold by the thousands.
That is when the test’s fate became divided: its future was now tied to assessing standards, but its mission had long been differentiating students. Could it yoke these competing ideals together? Perhaps to a degree, but not by merely redesigning itself; it would have to reposition and redefine itself as well. And parts of that redefinition are now validating concerns raised three years ago. We were assured that the new test would offer better transparency, demand greater rigor, and promote the importance of evidence. So far, it has fallen short of those pledges.
For starters, most students have received score reports that inflate their actual performance in one or more ways, and a lack of explanatory context around the data has led to misinterpretation of the results. Second, benchmarks have been quietly redefined and lowered, resulting in a dubious spike in the number of students deemed “college ready”. And finally, useful detail about newly implemented research methods, particularly the use of research study samples to determine percentiles, has not yet been disclosed.
Compass Education Group has conducted an extensive analysis of student and counselor PSAT reports; student, parent, and counselor feedback; and all pertinent tables and publications provided by College Board thus far. The result is a thorough explanation of the concerns raised above, with the rationale for each critique spelled out in detail and specific examples of irreconcilable or unexplained data provided throughout the full downloadable report. (A web version of the report can be found here.)
The Compass report’s findings include:
1) A hypothetical and less competitive measuring stick, along with an inflationary definition of percentile, was discreetly introduced and featured as the primary indicator of a test taker’s relative standing. Combined, the two changes can add as much as a 10% tailwind to a student’s apparent standing. (A toy illustration of how a definitional shift of this kind inflates percentiles follows this list.)
2) Score differences between sophomores and juniors are narrower than ever before, without explanation, and conflict with College Board’s own predictions of expected gains from one year to the next. At the high end of the score range, sophomores appear, on the whole, to be outperforming juniors.
3) Benchmark definitions have been revised and linked to a lower college GPA, while the benchmark thresholds themselves have been shifted lower, causing the predicted pass rate to soar above the rates of competing assessments. The new 12th grade benchmark is 60 points lower than the previous 10th grade benchmark. The new ELA benchmark is readily achieved by random guessing.
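To make the first finding concrete, here is a minimal sketch of how a percentile definition can inflate reported standing. The scores below are invented, not drawn from College Board data, and the sketch assumes only one kind of definitional shift: reporting the percent of test takers scoring at or below a student rather than the percent scoring strictly below.

```python
# Toy illustration only: invented scores, not College Board data.
# Shows how moving from "percent scoring below" a student to
# "percent scoring at or below" raises the reported percentile
# for the exact same score.

scores = [500, 550, 550, 600, 600, 600, 650, 700, 700, 750]

def pct_below(score, population):
    """Percent of the population scoring strictly below `score`."""
    return 100 * sum(s < score for s in population) / len(population)

def pct_at_or_below(score, population):
    """Percent of the population scoring at or below `score`."""
    return 100 * sum(s <= score for s in population) / len(population)

student_score = 600
print(pct_below(student_score, scores))        # 30.0
print(pct_at_or_below(student_score, scores))  # 60.0
```

The gap between the two definitions equals the share of test takers tied at the student’s score, so on real score distributions the inflation is largest exactly where large numbers of students cluster.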
To be fair, the tables accompanying the PSAT are all marked as “Preliminary.” But if the explanation for the statistical anomalies is that the paint is not yet dry, that raises the question of what three million students and their educators are to do with the scores they have been presented. The new PSAT reports are more detailed than ever. Which parts of the reports are reliable, and which parts remain under construction? Should educators simply set these reports aside for now? Should students avoid making test-taking and college choice decisions based on these scores?
The new SAT’s debut is just weeks away, and many of its components are being built on the same research studies and with the same methods used for the PSAT. It would seem prudent to establish credibility with PSAT data now rather than play catch-up after final SAT numbers are released.
While College Board long ago outgrew its official name — College Entrance Examination Board — the PSAT and SAT are still in transition from an earlier time. The public may still think of the PSAT and SAT primarily as college admission tests, but the exams are being tasked with an increasing number of duties — assessment, alignment, benchmarking, merit, gatekeeping, placement — for an increasing number of students and educators. States such as Connecticut, Michigan, Illinois, and Colorado have recently made expensive commitments to the SAT because of the promise of the redesign. Later this year, thousands of colleges will start receiving the new scores and slotting them into millions of applicant files (along with ACT scores). There is no slow build in standardized testing — change happens all at once. The inevitability of the new test, though, does not mean that College Board need only pay attention to operational aspects. The rapid rise in public funding for the PSAT and SAT, and the increasing number of tasks for which the exams claim competency, require an increased level of scrutiny, accountability, and transparency.
The Compass report (web version here) identifies many of the areas in which the redesigned tests will have to find their footing. Implementing an entirely new college admission exam is a Herculean undertaking, and College Board set ambitious goals for both rollout and validity. Efforts to meet deadlines must not undermine the original impetus for the redesign. Trust in the new exam should not come from the fact that its name reuses three letters; students and colleges deserve an SAT that judges fairly and openly.