[Part 2: Score Discrepancies is the second of a three-part report on the new PSAT. See Overview, Part 1: Percentile Inflation, and Part 3: Lowered Benchmark. The entire report can also be downloaded or distributed as a PDF.]
Part 2: Score Discrepancies
A historically narrow gap between sophomore and junior performance does not seem credible and raises questions about how scoring, scaling, and weighting were performed and reported.
Sophomore Versus Junior Score Discrepancies Call Scoring Methodologies into Question
Percentile inflation caused by redefinition and re-norming creates unfortunate misinterpretations, but the sources of the change can be readily identified: previous percentile tables can be restated based on the new definition, and Nationally Representative percentiles can be compared with User percentiles to gauge how much that distinction adds. However, without further information from College Board, it is impossible to know the accuracy of the 11th and 10th grade percentiles. Our analysis shows that there are significant problems in the way the numbers are being presented, and that these problems mask the very things the new test was meant to reveal: college readiness and academic progress. If score results between grades are suspect, that raises questions about the pilot studies that were performed and how they inform the scoring for the PSAT and SAT.
Expected Versus Observed Score Differences Between Grades
Historically, juniors have outperformed sophomores on the PSAT/NMSQT by approximately 5 points per section [see table below]. Translated into SAT scores, the differences between 10th and 11th graders in 2014 were 48 points, 47 points, and 51 points in Critical Reading, Writing, and Math, respectively. On the new PSAT, however, the reported difference is only 12 points in Evidence-Based Reading and Writing (EBRW) and 19 points in Math. The average per-section difference in 2014 (about 49 points) is more than three times that seen in 2015 (15.5 points). The 2014 grade differences were in line with those seen over the last decade, so they were not anomalous. The old and new PSAT are different tests, but student growth tends to show up similarly even on different college admission exams.
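The "more than 3 times" comparison can be checked with simple arithmetic. The sketch below uses only the section gaps quoted in the paragraph above; weighting each section equally, and comparing a three-section average with a two-section average, are our assumptions, since the report does not say how the averages were formed. The variable and function names are illustrative only.

```python
# Junior-minus-sophomore gaps quoted in the text, in SAT-scale points.
gaps_2014 = {"Critical Reading": 48, "Writing": 47, "Math": 51}
gaps_2015 = {"EBRW": 12, "Math": 19}


def mean_gap(gaps):
    """Unweighted mean of the per-section gaps (equal weighting is an assumption)."""
    return sum(gaps.values()) / len(gaps)


avg_2014 = mean_gap(gaps_2014)
avg_2015 = mean_gap(gaps_2015)
ratio = avg_2014 / avg_2015

print(f"2014 average gap: {avg_2014:.1f} points")
print(f"2015 average gap: {avg_2015:.1f} points")
print(f"Ratio of 2014 to 2015: {ratio:.2f}")
```

Under these assumptions the 2014 average gap comes to roughly 48.7 points against 15.5 points in 2015, a ratio of a little over 3.1.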
Are Low Score Discrepancies Due to Differing Testing Populations?
Not all sophomores and juniors take the PSAT. Some take the PSAT as mandatory testing; some take the PSAT in order to qualify for National Merit; some take the ACT Aspire instead of the PSAT. If College Board’s calculation of a nationally representative sample is correct, though, this year’s grade differences should be immune from differences in test-taker demographics. Previous PSATs lacked a nationally representative sample, so sophomore-to-junior comparisons may be distorted by test-taker patterns. One way of removing potential distortion is to look at the results only for repeat testers: students who took the test in both school years. College Board has done research on the typical score change on the old PSAT by analyzing only students who took the test as sophomores and repeated the test as juniors [see table below]. The average increase, expressed in SAT points, was 33 points in Critical Reading, 33 points in Writing, and 40 points in Math. These figures are still at least twice what is being shown on PSAT reports as the 10th grade to 11th grade score differential.
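The repeat-tester comparison can be broken out by subject area as a rough check. The sketch below uses the repeat-tester gains quoted above and the 2015 grade differences reported earlier (12 points in EBRW, 19 points in Math). Treating the mean of the old Critical Reading and Writing gains as the counterpart of the new EBRW section is our assumption, not a College Board concordance, and the names are illustrative.

```python
# Average sophomore-to-junior gains for repeat testers on the old PSAT,
# expressed in SAT-scale points (figures quoted in the text).
repeat_gains = {"Critical Reading": 33, "Writing": 33, "Math": 40}

# Grade differences reported for the 2015 PSAT (figures quoted in the text).
reported_2015 = {"EBRW": 12, "Math": 19}

# Assumption: the old verbal sections are averaged to stand in for EBRW.
verbal_gain = (repeat_gains["Critical Reading"] + repeat_gains["Writing"]) / 2

verbal_ratio = verbal_gain / reported_2015["EBRW"]
math_ratio = repeat_gains["Math"] / reported_2015["Math"]

print(f"Verbal: repeat-tester gain is {verbal_ratio:.2f}x the 2015 difference")
print(f"Math:   repeat-tester gain is {math_ratio:.2f}x the 2015 difference")
```

On these assumptions both ratios exceed 2 (about 2.75 for the verbal sections and about 2.1 for Math), so the shortfall is not confined to one subject area.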
Do Content Differences Between Old and New PSATs Provide an Explanation?
A remaining problem is that the old PSAT is not the new PSAT. Although the new and old tests cover roughly the same score range and do not have radically different means or standard deviations, we cannot be certain that year-over-year growth is identical. A third set of data is College Board’s own estimates of growth. Below are the College and Career Readiness Benchmarks.
College Board assumes that students improve by roughly 30 points from sophomore year PSAT to junior year PSAT and by another 20 points from junior year PSAT to SAT. The PSAT figures, which themselves seem conservative, are still twice what is shown in the 2015 student data.
Percentile Data for Sophomores and Juniors May Prove the Existence of Errors in Presentation, Computation, or Norming
The low observed score differences between 10th and 11th graders do not fit into a historical pattern, match studies of repeat testers, or align with assumed College Board benchmark progress. As improbable as the small point discrepancy is, though, it seems impossible to go one step further and state that sophomores outperform juniors. But this is exactly what the published percentile tables show [below].
As you move up the scale, the difference between 10th and 11th graders disappears and then turns in favor of the younger students. Read literally, the score tables say that more sophomores than juniors achieved top scores on the PSAT/NMSQT. There have always been talented sophomores who score highly on the PSAT, but as a group, these students should not do better on the PSAT in 10th grade than they do in the 11th.
These figures are for the Nationally Representative groups, so they cannot be explained away by saying that the test-taking populations are different. There is no logical statistical or content-based explanation for how sophomores could actually perform better than juniors. In fact, we should be seeing scores 30 to 50 points higher per section for juniors. The most likely explanation is that the surveying and weighting methods used for the PSAT did not properly measure the class year compositions. If we assume this to be the case, though, can we be assured that the studies did any better in measuring the intra-class composition? Will the SAT be immune from the same problems?
Can Anything Explain the Low Sophomore/Junior Score Differences and the Score Inversion?
A suspect in the mix is the PSAT 10. Although the content of the PSAT 10 is identical to that of the PSAT/NMSQT, it is positioned as a way for schools to measure how students perform near the end of the sophomore year rather than toward the outset of the year. The PSAT 10 will first be offered between February 22 and March 4, 2016. It is a safe assumption that spring sophomores, adjusted for differences in the testing pool, will score higher than fall sophomores. If College Board statistically accounted for PSAT 10 takers in its figures, the scores for sophomores would be inflated.
It seems academically inappropriate to lump PSAT/NMSQT and PSAT 10 scores into the same bucket. The tests are taken at different phases of a student’s high school progress. In fact, one reason a PSAT 10 exists is that spring performance differs from fall performance. The only clue that College Board may have made such a combination is reproduced from its Understanding Scores 2015. Highlighting has been added.
It’s likely that this reference is simply the result of a production error. The document never makes this reference again in its 32 pages. In short, all figures likely measure October performance for sophomores and juniors. This final attempt to explain the anomalous supremacy of sophomores comes up short. Even had a PSAT 10 explanation proved successful, it would have raised more questions than it answered.
Tables surrounding the PSAT are all marked as “Preliminary.” College Board has made clear that final scaling for the redesigned SAT (and the PSAT is on the same scale) will not be completed until May 2016. Final concordance tables between old and new tests will replace any preliminary work. If the explanation of the statistical anomalies is that the paint is not yet dry, it raises the question of what 3 million students and their educators are to do with the scores they have been presented. The new PSAT reports are the most detailed that have ever existed. They have total scores, section scores, test scores, cross-test scores, sub-scores, Nationally Representative percentiles, User percentiles, SAT score projections, sophomore and junior year benchmarks, and more. Which parts of the reports are reliable and which parts remain under construction? Should educators simply push these reports aside and wait until next year? Should students make test-taking and college choice decisions based on these scores?