Texas is replacing thousands of human exam scorers with AI

Shawn Knight

In a nutshell: Students in Texas will be among the first to have state-mandated tests scored by an AI-powered platform. The written portion of the State of Texas Assessments of Academic Readiness (STAAR) exam, which gauges skill levels in reading, writing, science, and social studies, will be graded using an "automated scoring engine."

The test was redesigned in 2023. The revised exam features fewer multiple-choice questions and more open-ended questions, known as constructed response items – as many as seven times more than before.

According to the Texas Tribune, the natural language processing approach could save the state upwards of $20 million per year – money that would have otherwise been spent to hire human scorers from a third-party contractor.

Jose Rios, director of student assessment at the Texas Education Agency (TEA), said they wanted to keep as many open-ended responses as possible, but noted that they take an incredible amount of time to score.

Machines aren't replacing human scorers entirely – at least, not yet. Last year, the TEA hired roughly 6,000 temporary human scorers. This year, it will need fewer than 2,000.

A quarter of all constructed responses initially scored by AI will be reevaluated by humans, as will tests in which the computer isn't confident of its score. Responses written in a language other than English and those with slang words will also be passed along to human scorers.
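The routing rules described above amount to a simple decision procedure. Here is a minimal sketch of that logic – purely illustrative, with all field names and the confidence threshold invented; the article does not describe the TEA's actual implementation:

```python
import random

# Hypothetical sketch of the review-routing rules described in the article.
# The 25% audit rate comes from the article; the confidence floor and the
# response fields are invented for illustration.
AUDIT_RATE = 0.25          # a quarter of AI-scored responses get human review
CONFIDENCE_FLOOR = 0.85    # below this, the engine "isn't confident"

def needs_human_review(response, ai_confidence, rng=random):
    """Return True if a constructed response should go to a human scorer."""
    if ai_confidence < CONFIDENCE_FLOOR:
        return True                      # engine not confident in its score
    if response.get("language") != "en":
        return True                      # written in a language other than English
    if response.get("contains_slang"):
        return True                      # slang gets passed along to humans
    return rng.random() < AUDIT_RATE     # random quality-control sample

# A confident score on an English response with slang is still routed to a human
print(needs_human_review({"language": "en", "contains_slang": True}, 0.95))  # prints True
```

Note that the random audit is checked last, so flagged responses aren't double-counted against the quality-control sample.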

The automated scoring engine was trained on 3,000 responses that first went through two rounds of human scoring. The samples allowed the AI to gauge common characteristics of responses, and instructed it on how to give the same score that a human would.
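The "give the same score a human would" idea can be illustrated with a toy example. The sketch below scores a response by word overlap with the closest human-scored training sample – a deliberately crude stand-in, since the article only says the engine uses natural language processing, not how it works:

```python
from collections import Counter

# Toy illustration of "train on human-scored samples, then mimic the human
# score". A real scoring engine would use NLP features, not raw word overlap;
# the training texts and scores below are invented.
training = [
    ("the cell membrane controls what enters and leaves the cell", 2),
    ("the membrane lets some things in", 1),
    ("i dont know", 0),
]

def score(response):
    """Give the response the score of the most word-similar training sample."""
    words = Counter(response.lower().split())
    def overlap(sample_text):
        # count words shared with the training sample (with multiplicity)
        return sum((words & Counter(sample_text.split())).values())
    return max(training, key=lambda sample: overlap(sample[0]))[1]

print(score("the membrane controls what leaves the cell"))  # prints 2
```

The point of the toy: the engine never "understands" the response, it only matches it against patterns seen in the human-scored training set, which is also why it defers to that original training rather than learning from new responses.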

Chris Rozunick, division director for assessment development at the TEA, said they have always had very robust quality control processes with humans, and that it's similar with a computer system. Just don't call it AI.

"We are way far away from anything that's autonomous or can think on its own," Rozunick said. For example, the scoring solution doesn't "learn" from one response to the next; rather, it'll always defer to its original training as a reference.

Image credit: Pixabay, Katerina Holmes

 
All well and good, but I would like to see what the fate of those ~$20 million they save will be...

See, capitalism and privatisation don't work. You are right to be sceptical. If it's anything like here, budgets must be spent to keep the money coming: if an area is seen not to spend its budget, then it doesn't need as much the following year, so they spend and spend, and overpay in places to get kickbacks. So will they say, "Here is $20 million and we did some good with it, here is the proof," etc.?
 
I can't see this being exploited at ALL.
See, capitalism and privatisation don't work. You are right to be sceptical. If it's anything like here, budgets must be spent to keep the money coming: if an area is seen not to spend its budget, then it doesn't need as much the following year, so they spend and spend, and overpay in places to get kickbacks. So will they say, "Here is $20 million and we did some good with it, here is the proof," etc.?
The exact same thing happens in socialist and communist economies. Try again.
 
Learn what words mean. The TEA is a government agency; there is no "privatization" happening here. Nor is a state-mandated test being scored by a state agency in any way, shape, or form, an example of "capitalism" at work.
Point taken, but the article does hint that the human scoring was contracted out to a third-party, presumably non-governmental, entity, and we might expect the AI scoring was/will be as well.

I don't know if we'll get a study out of this, but I'm curious as to how consistent and useful the human scoring could have been in the first place, and how the machine version will compare. In the end, I expect the cost savings will be the only hard data point.
 
You can cut roles, but it also means less taxable income. And running AI is not free. So on paper it sounds like a great idea, but that money either goes to upkeep of the AI or is paid to some vendor that offers this service. Save $20 million? Think again.
 
For the sake of budget efficiency, human exam scorers are to be replaced by AI...
To save another budget, what else will be replaced by AI in the future...?
 
All well and good, but I would like to see what the fate of those ~$20 million they save will be...

Something other than a problem they invented.

You can cut roles, but it also means less taxable income. And running AI is not free. So on paper it sounds like a great idea, but that money either goes to upkeep of the AI or is paid to some vendor that offers this service. Save $20 million? Think again.

How do you know the $20 million isn't after the cost to implement it? Are you just looking for something to be upset about?
 
Yet another thing Texas chooses to do without letting its citizens vote on it.
Eh? What nonsense is this? Every state in the union makes tens of thousands of decisions every day – indeed, every minute of every day – without holding public referendums on each and every one. Why would Texas citizens need to vote on whether or not to save millions of taxpayer dollars on a more efficient system for standardized testing? You may be seduced by the "AI" buzzword, but in practice this is not much more than using a spellchecker algorithm to assist in grading.
 
I see this being gamed substantially by students. I did an AI-marked English language test (PTE) a while ago. The internet is full of tips and tricks on which keywords to use to trick the AI into believing your response has a certain sought-after structure. The content then did not need to be coherent at all, as long as the individual sentences were structured properly.
 