Abstract
Graph-structured data provides a fundamental representation for modeling complex systems, from social networks to molecular interactions. Learning effective representations from such data is therefore a central problem in machine learning. Graph Neural Networks (GNNs) have emerged as the dominant paradigm for this task and achieve strong performance on node-, edge-, and graph-level prediction, but they face significant limitations in expressiveness, adaptability, and generalization. Specifically, standard architectures often struggle to capture complex substructures, such as cycles and motifs, that are critical in scientific domains. Moreover, many models are tailored to specific tasks or distributions, which limits their ability to generalize across heterogeneous graph types. In parallel, Large Language Models (LLMs) have revolutionized AI with their remarkable reasoning and generalization capabilities. As graph data increasingly incorporates rich textual and semantic information, integrating LLMs into graph learning pipelines offers a promising avenue for enhancing representation learning. However, LLMs are inherently sequential, which makes it difficult for them to model structured, non-Euclidean graph data and makes their direct application non-trivial. This dissertation advances modern graph learning by systematically investigating the interplay between GNNs and LLMs across four key directions. First, we enhance GNN expressiveness by introducing distance-aware architectures and leveraging high-dimensional Weisfeiler–Lehman tests. Second, we explore how to incorporate LLMs to design both a unified graph classification framework, capable of addressing diverse tasks within a single model, and a graph–language co-training paradigm that jointly learns structural and semantic representations in a self-supervised manner. Third, we develop methods that enable LLMs to reason directly over large-scale graphs with high accuracy and efficiency. Finally, we present a case study on signaling pathway inference from scRNA-seq data using the proposed techniques, highlighting the synergy between accurate graph modeling and interpretability. Collectively, this work bridges graph structure and language-based reasoning, contributing toward a unified framework for expressive, generalizable, and scalable graph learning.
Committee Chair
Chenyang Lu
Committee Members
Chongjie Zhang; Fuhai Li; Jiayin Jin; Yixin Chen
Degree
Doctor of Philosophy (PhD)
Author's Department
Computer Science & Engineering
Document Type
Dissertation
Date of Award
4-15-2026
Language
English (en)
DOI
https://doi.org/10.7936/2hev-1s89
Recommended Citation
Feng, Jiarui, "Towards Modern Graph Learning: from Graph Neural Networks to Large Language Models" (2026). McKelvey School of Engineering Graduate Student Theses & Dissertations. 1382.
The definitive version is available at https://doi.org/10.7936/2hev-1s89