Submission Date

5-8-2026

Document Type

Paper

Department

Computer Science

Adviser

William Mongan

Committee Member

William Mongan

Committee Member

Molly O'Rourke-Friel

Department Chair

Nicholas Scoville

Project Description

Providing individualized feedback to math students is a resource-intensive bottleneck in STEM education. We present Mentir-AI, a tool designed to expand teacher capacity by generating high-quality mathematical feedback, built on the Mathforum's "Problem of the Week" archive. Using a corpus of nearly one million interactions, we compare the efficacy of Retrieval-Augmented Generation (RAG) and Fine-Tuning (FT) architectures. This study details the development of an automated grading pipeline, the evolution of a multi-component system prompt, and the implementation of an automated mentor grading system for AI-led evaluation. While Fine-Tuning demonstrates superior instructional judgement, our results identify persistent failure modes in mathematical accuracy and pedagogical judgement. Consequently, we propose a shift from one-shot prompting to an agentic architecture that separates mathematical reasoning from pedagogical drafting.

Comments

Grant funding provided by the Gates Foundation.
