My history students recently tested the potential of several leading large language models (LLMs) to do the work of Advanced Placement U.S. History essay graders. The models were provided with College Board scoring guidelines and were prompted to score sample College Board U.S. History exam responses. For each essay, the models made decisions on awarding…