Abstract
High-entropy alloys (HEAs) are a relatively recent class of materials formed by alloying five or more elements in near-equimolar ratios. Their mechanical and thermal properties make them promising for a wide range of applications; however, the vast compositional space makes efficient exploration difficult. As a result, robust and structured databases are critical for guiding HEA research and accelerating materials discovery. In this project, a computational pipeline was developed that uses large language models (LLMs) to extract compositional data and relevant properties from existing HEA literature. The program integrates structured prompt design, parallel execution, and deterministic post-processing to compile the mined data into a single database. Performance evaluation demonstrated high consistency across repeated queries, and the full pipeline processed fifty papers in under 1.5 minutes. This work demonstrates the feasibility of using LLM-assisted workflows to rapidly construct materials databases from unstructured literature, offering a scalable approach for future materials informatics and data-driven alloy design efforts. Future work will focus on incorporating additional validation mechanisms and creating an interactive website for viewing the compositional data.
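The overall shape of such a pipeline (per-paper extraction run in parallel, followed by deterministic post-processing into one table) can be illustrated with a minimal Python sketch. This is not the report's implementation: the function names, the record fields, and the worker count are all hypothetical, and a simple regular expression stands in for the LLM call so that the example runs without any external service. The pattern also ignores stoichiometric subscripts such as Al0.5.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# ASSUMPTION: placeholder for the LLM extraction step. A real pipeline would
# send a structured prompt to a model and parse its reply; a regex that
# matches five or more consecutive element-like symbols keeps this runnable.
ALLOY_PATTERN = re.compile(r"\b((?:[A-Z][a-z]?){5,})\b")


def extract_records(paper):
    """Return one raw record per alloy formula found in a paper's text."""
    found = ALLOY_PATTERN.findall(paper["text"])
    return [{"source": paper["id"], "alloy": formula} for formula in found]


def normalize(alloy):
    """Deterministic post-processing: sort element symbols alphabetically."""
    return "".join(sorted(re.findall(r"[A-Z][a-z]?", alloy)))


def build_database(papers, workers=8):
    """Extract from all papers in parallel, then merge into sorted rows."""
    # Papers are independent of one another, so extraction can be mapped
    # over a thread pool; pool.map preserves the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_paper = list(pool.map(extract_records, papers))
    # Flatten, normalize, and de-duplicate, then sort so that repeated runs
    # over the same input always yield the same table.
    rows = {
        (normalize(rec["alloy"]), rec["source"])
        for records in per_paper
        for rec in records
    }
    return [{"alloy": alloy, "source": source} for alloy, source in sorted(rows)]


if __name__ == "__main__":
    papers = [
        {"id": "p1", "text": "The CoCrFeMnNi alloy and NiCoCrFeMn were studied."},
        {"id": "p2", "text": "AlCoCrFeNi shows high hardness."},
    ]
    for row in build_database(papers):
        print(row)
```

In this sketch, sorting the element symbols means that two spellings of the same alloy collapse into one row, which is one simple way post-processing can stay deterministic even when the extraction step returns formulas in varying order.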
Document Type
Final Report
Class Name
Mechanical Engineering and Materials Science Independent Study
Language
English (en)
Date of Submission
12-31-2025
Recommended Citation
Mellin, Ruth E., "Implementation of Machine Learning and Large Language Models for High Entropy Alloys" (2025). Mechanical Engineering and Materials Science Independent Study. 309.
https://openscholarship.wustl.edu/mems500/309