Sunday, May 31, 2020

The SAT Quandary

A month has passed since the first glimpse of the new SAT. The buzz has abated and will intensify again once details are disclosed on April 16th. In an attempt to improve on the silence, I share here what I'm seeing, and not seeing.

Though highlights of the changes have been suitably covered in the media, the responding discourse has mostly circumvented what should be our central concern. Despite the changes the SAT has undergone over the years, what the test is has always mattered less than what the test does. Rather than anticipate what the new test will contain, we should scrutinize more closely what the new test will reliably do: face validity versus predictive validity. If the former were to supplant the latter, the test's relevance in college admissions would decline.

Face validity is when a test appears, on its face, to measure what is deemed important. Psychometrically, this is the weakest form of evidence; ironic, given the SAT's move toward testing one's ability to anchor answers to evidence. That a test seems to measure something doesn't mean it does. However, face validity is important because it affects public attitudes toward a test. Sound familiar?

Predictive validity is when a score positively correlates with criteria measured in the future. The SAT is only relevant if colleges can rely on its predictive validity. If the new test became less assuring of this promise than its ancestors, its meaningfulness would taper, but we'd need years to know that for sure. Meanwhile, colleges will accept it on its face, or not.

Predictive validity is pertinent because of what we didn't hear, for once, on March 5th. Previously, we heard that the new SAT would improve its differentiation at its tails. Given what I know about test construction, I was skeptical. As more is asked of the SAT, might it do fewer things well? As it soon flexes to do more by ostensibly testing less, no less, something will have to give.
And until we see countervailing evidence, I sense the test's optics will diminish.

You can't get finer scores than the questions you have available allow. Try to imagine a test that at once improves its differentiation and focuses on fewer things: those few things that, we've been sententiously told, matter most. Or a test that opens more doors of opportunity and transforms possibilities for everyone and anyone and still makes meaningful distinctions across a broad range of applicants. The test is shortening, its breadth narrowing, and its questions may take longer to answer (restricted use of calculators, evidence-supported answers, more complex equations and analysis, more reading). And without deductions for wrong answers, the range of possible raw points shrinks further. This redesign is a radical dream and a psychometric nightmare.

So while much focus will be on how these changes will treat test takers, the bigger issue is how they might trammel the test makers. Shorten the test, narrow its focus and perhaps reduce the number of possible raw outcomes: can they really do all this and improve the test's performance? Or is this a shift in utility? A surrendering of a once high-stakes differentiator, recast as a capstone exam; a test that pushes out rather than pulls in?

And while the noble mission of delivering opportunity to under-represented students is well stated, I fear unintended consequences. The original SAT was meant to reveal latent potential in unlikely places, to discover talent in students who don't necessarily have classical educations. Over time, the test was said to have lost its way. However, might the new changes only further undermine that original goal? It is said the new test will be modeled on the work of our best classroom teachers and the most rigorous course work. Who will that most benefit?
Might the test become even more yielding to the most advantaged students and even more futile for those most hindered?

If so, then will the test distinguish between two otherwise outstanding applicants as well as it does now? While unpopular to discuss out loud, selective colleges find the conspicuous difference of 300 SAT points between two otherwise comparable applicants consequential. If that difference is obscured, the test becomes less useful to its consumers.

And the footing gets even more dangerous with the test's handling of sub-populations. That the SAT correlates with family income, for example, has not been a design flaw; it's been a usage flaw. If the new test were engineered to break that correlation, it would earn a Pyrrhic victory on its way to defeat. Despite what some think, and even claim, colleges don't exactly need a test on which sub-populations do better. Rather, they want a test that better identifies which individuals of a sub-population will most likely succeed. A sudden rise in scores among sub-groups is no promise that it will. And if traditionally underperforming groups further underperform, well, then that's an even bigger problem.

To be fair, we've been assured that previous scores will map to new scores, which implies that the bell curve's normal shape will remain apparent. But the alignment of scores, given the narrowing of the test, could involve a smushing of scores, a side effect that contradicts earlier indications that the SAT will soon get better at distinguishing among students.

For all its limitations, the SAT has endured because it reliably produces, by design, a symmetric distribution of outcomes every year. What that distribution means is another matter. Changes to the test can crown different winners, but they can't crown more winners unless it becomes a test of mastery. And a test of mastery will have either a negative or positive skew, take your pick.
AP Exams and Subject Tests exemplify this.

And this is where President Coleman seems understandably torn: his background is in mastery, but the mission of the SAT has long been differentiation. Can he yoke these competing ideals? Perhaps to a degree, but that would redefine the SAT, and it would be a gamble I am unsure member colleges are prepared to take.
