Submission Date
5-8-2026
Document Type
Paper
Department
Computer Science
Adviser
William Mongan
Committee Member
William Mongan
Committee Member
Molly O'Rourke-Friel
Department Chair
Nicholas Scoville
Project Description
Providing individualized feedback to math students is a resource-intensive bottleneck in STEM education. We present Mentir-AI, a tool designed to expand teacher capacity by generating high-quality mathematical feedback using the Mathforum's "Problem of the Week" archive. By analyzing a corpus of nearly one million interactions, we compare the efficacy of Retrieval-Augmented Generation (RAG) and Fine-Tuning (FT) architectures. This study details the development of an automated grading pipeline, the evolution of a multi-component system prompt, and the implementation of an automated mentor-grading system for AI-led evaluation. While Fine-Tuning demonstrates superior instructional judgement, our results identify persistent failure modes in mathematical accuracy and pedagogical judgement. Consequently, we propose a shift from one-shot prompting to an agentic architecture that separates mathematical reasoning from pedagogical drafting.
Recommended Citation
Cummins, Michael J., "The One-Shot Ceiling: Comparing RAG and Fine-Tuning Architectures for AI-Assisted Math Mentoring" (2026). Computer Science Honors Papers. 2.
https://digitalcommons.ursinus.edu/comp_hon/2
Included in
Artificial Intelligence and Robotics Commons, Educational Technology Commons, Science and Mathematics Education Commons
Comments
Grant funding provided by the Gates Foundation.