Date of Award
Doctor of Philosophy (PhD)
The past few years have seen the growing prevalence of deep neural networks in application domains including image processing, computer vision, speech recognition, machine translation, self-driving cars, game playing, social networks, bioinformatics, and healthcare. Owing to its broad applications and strong performance, deep learning, a subfield of machine learning and artificial intelligence, is changing everyone's life.

Graph learning, which learns knowledge from graph-structured data, has been another hot field in the machine learning and data mining communities. Examples of graph learning range from social network analysis such as community detection and link prediction, to relational machine learning such as knowledge graph completion and recommender systems, to multi-graph tasks such as graph classification and graph generation.

An emerging field, graph deep learning, aims at applying deep learning to graphs. To deal with graph-structured data, graph neural networks (GNNs) have been invented in recent years; they take graphs directly as input and output graph or node representations. Although GNNs have outperformed traditional methods in tasks such as semi-supervised node classification, there remains a wide range of other important graph learning problems where either the applicability of GNNs has not been explored or GNNs perform less satisfactorily.

In this dissertation, we dive deeper into the field of graph deep learning. By developing new algorithms, architectures, and theories, we push the boundaries of graph neural networks to a much wider range of graph learning problems.
The problems we have explored include: 1) graph classification; 2) medical ontology embedding; 3) link prediction; 4) recommender systems; 5) graph generation; and 6) graph structure optimization.

We first focus on two graph representation learning problems: graph classification and medical ontology embedding. For graph classification, we develop a novel deep GNN architecture that aggregates node features through a novel SortPooling layer, which replaces the simple summing used in previous work. We demonstrate its state-of-the-art graph classification performance on benchmark datasets. For medical ontology embedding, we propose a novel hierarchical attention propagation model, which uses an attention mechanism to learn embeddings of medical concepts from hierarchically structured medical ontologies such as ICD-9 and CCS. We validate the learned embeddings on sequential procedure/diagnosis prediction tasks with real patient data.

Then we investigate the potential of GNNs for predicting relations, specifically link prediction and recommender systems. For link prediction, we first develop a theory unifying various traditional link prediction heuristics, and then design a GNN-based framework that automatically learns suitable heuristics from a given network. Our model shows unprecedentedly strong link prediction performance, significantly outperforming all traditional methods. For recommender systems, we propose a novel graph-based matrix completion model, which uses a GNN to learn graph structure features from the bipartite graph formed by user and item interactions. Our model not only outperforms various matrix completion baselines, but also demonstrates excellent transfer learning ability: a model trained on MovieLens can be directly used to predict Douban movie ratings with high performance.

Finally, we explore the applicability of GNNs to graph generation and graph structure optimization.
We focus on a specific type of graph that usually carries computations, namely directed acyclic graphs (DAGs). We develop a variational autoencoder (VAE) for DAGs and prove that it can injectively map computations into a latent space. This injectivity allows us to perform optimization in the continuous latent space instead of the original discrete structure space. We then apply our VAE to two types of DAGs: neural network architectures and Bayesian networks. Experiments show that our model not only generates novel and valid DAGs, but also finds high-quality neural architectures and Bayesian networks by performing Bayesian optimization in its latent space.
Michael Avidan, Sanmay Das, Roman Garnett, Brendan Juba,