
        Helping accounting educators use AI effectively

        July 8, 2025 By Troy Turner


        Harbert-led research team wins national best paper award for its investigation of how well ChatGPT performed on accounting case assignments.

        Eight faculty headshots

        Harbert accounting faculty authors of the award-winning paper are (left to right starting with top row): Travis Holt, Kerry Inger, Greg Jenkins, Tina Loraas, Jefferson Jones, Jonathan Stanley, Mollie Mathis and James Long. Fellow co-author Xu "Joyce" Cheng (not pictured) was formerly an Auburn accounting faculty member; she is now a professor at the University of Nevada, Las Vegas. 

        A team led by Auburn University researchers with the Harbert College of Business recently received national recognition for its work on a paper entitled: “Artificial Intelligence’s Capabilities, Limitations, and Impact on Accounting Education: Investigating ChatGPT’s Performance on Educational Accounting Cases.”

        The American Accounting Association honored the team with its 2025 Issues in Accounting Education Best Paper Award.

        Members of the team included Xu (Joyce) Cheng (University of Nevada, Las Vegas), Ryan T. Dunn (Troy University), Travis Holt, Kerry Inger, J. Gregory Jenkins, Jefferson P. Jones, James H. Long, Tina Loraas, Mollie E. Mathis, Jonathan Stanley and David A. Wood (Brigham Young University).

        How it began

        “This paper grew out of another unique accounting research project,” said Long, who serves as Harbert’s assistant dean of Global Programs and as the Amy B. Murphy Endowed Distinguished Professor. “David Wood at BYU had the idea to have a number of professors around the country run their exam questions through ChatGPT to see how well it could perform on accounting-specific questions.

        “This was a really cool paper because it brought together more than 300 coauthors to generate an enormous data set of 28,085 ChatGPT answers to accounting exam questions,” Long said. “Overall, we determined that students significantly outperformed that version of ChatGPT on these questions, but for about 16% of the questions, ChatGPT outperformed the students.”

        Several members of the Auburn-led team worked on this initial paper.

        “Although it provided a substantial body of evidence about how well ChatGPT did on accounting exam questions, we didn’t feel like students would have access to ChatGPT for most of the exams that we give in accounting. Instead, we felt that students would be more likely to employ ChatGPT to help with out-of-class assignments, including case studies,” said Long.

        “The Auburn School of Accountancy is currently ranked No. 1 in the world for educational accounting case research, so it made sense for our faculty who had authored educational accounting cases to tackle this topic. We included David Wood from BYU because he had the initial idea for the original project. It was a great team and the authorship experience couldn’t have gone better,” he added.

        Looking to help instructors

        The team’s first goal was to determine whether the state-of-the-art ChatGPT model could perform well on the types of assignments that the faculty traditionally give outside of class.

        As noted in the paper, the researchers found that ChatGPT’s ability to provide accurate solutions varied depending on the type of case requirement, with better performance on tasks requiring explanation, application of rules, and ethical evaluation using a framework.

        “However, ChatGPT performed relatively poorly on tasks that require financial statement creation, journal entries, or software use. Our study also found that detection tools provided by ChatGPT’s developer are ineffective in identifying text created by artificial intelligence text generators (AITG).

        “These quantitative results, although limited in generalizability, illustrate the current ‘state of the art’ and allow us to suggest ways in which instructors can structure assignments to reduce the effectiveness of AITGs in subverting the learning process and ways in which instructors can incorporate AITGs into assignment requirements to help students attain desired educational outcomes,” the authors wrote.

        If you can’t beat it...

        “Unfortunately,” Long said, “that snapshot of the ‘state-of-the-art’ is already stale, despite our paper being in print for only a little more than a year. We noted overall improvement from ChatGPT 3.5 to 4.0 and additional versions of ChatGPT have been released since we concluded our study.

        “The anticipated performance gains with each new generation suggest that it will soon likely have the ability to outperform the average student on a whole host of assignments. Thus, it feels inevitable that it will be a dominant part of the educational technology landscape in the not-too-distant future, if it isn't already.”

        Long said he and his fellow researchers wanted to help educators think through how to use AI productively in the classroom rather than suggest trying to ban or prevent it, for two reasons:

        • AI use probably can’t be entirely prevented (ChatGPT’s own AI detection tool failed to effectively distinguish between student and AI text, and its developer pulled the tool shortly after the study was completed), and
        • to the extent that AI will be used in practice, students will need to understand how to employ the technology productively.

        “Accounting is a practical discipline, and efficiency and effectiveness are critical measures of success,” Long said. “To the extent that AI and large language models can help accounting professionals deliver efficient and effective solutions, they will add tremendous value to the profession.”

        ###

        Learn more about the Harbert College of Business School of Accountancy