Evaluating ESG Scoring Consistency of Large Language Models Using Retrieval Augmented Generation Methods

Authors

  • Gaargi Bora, Atholton High School, Maryland, United States
  • Sashrika Gupta, West Windsor-Plainsboro High School South, New Jersey, United States
  • Dr. Sudip Gupta, PhD, Johns Hopkins University Carey Business School, United States

DOI:

https://doi.org/10.14738/assrj.1212.19750

Keywords:

Environmental, Social, Governance (ESG), Artificial Intelligence (AI), Financial Institutions, Retrieval-Augmented-Generation (RAG)

Abstract

This study evaluates how large language models (LLMs) determine environmental, social, and governance (ESG) scores when using retrieval-augmented generation (RAG) procedures. Three LLMs (Claude-4, ChatGPT-4o, and Gemini-2.5) were used to produce environmental (E), social (S), and governance (G) scores for nine publicly traded companies in three size categories (small, medium, and large) based on market capitalization. The scores of four companies (Morgan Stanley, Goldman Sachs, Berkshire Hathaway, and East West Bancorp) were obtained using one set of prompts, and the scores of the other five (BlackRock, PNC, Bank of America, American Financial Group, and GreenDot) were obtained using a separate set of prompts that applied the same criteria and methods with changes in wording. In both trials, RAG approaches produced more stable scores that were more consistent with the existing scores assigned by established rating agencies (Morningstar Sustainalytics, S&P Global, and JUST Capital). These findings suggest that LLMs measure ESG performance more consistently when they use RAG methods and are given structured data and criteria, which points to the growing capability and prospective viability of LLM-determined ESG ratings.

Published

2025-12-23

How to Cite

Bora, G., Gupta, S., & Gupta, S. (2025). Evaluating ESG Scoring Consistency of Large Language Models Using Retrieval Augmented Generation Methods. Advances in Social Sciences Research Journal, 12(12), 218–234. https://doi.org/10.14738/assrj.1212.19750