Graphs can accurately represent most real-world data. Graph classification is an important graph mining task: it predicts which label or category a graph belongs to, and it has many applications such as classifying chemical compounds, predicting the function of protein structures, and analysing social networks. Traditional approaches focus on extracting graph statistics or features and comparing them for similarity. Graph kernels are one such popular method. The similarities are computed using measures such as performing random walks on both graphs or computing all-pairs shortest paths. The resulting similarity matrix can be provided as input to any kernelized algorithm, such as a Support Vector Machine (SVM), for classification. Although graph similarity can be computed in polynomial time using graph kernels, the time complexity is still very high and performance on large graphs is poor.
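To make the kernel-plus-SVM pipeline concrete, the following is a minimal sketch in Python. It uses a simple shortest-path-length histogram as the per-graph feature and a precomputed Gram matrix for the SVM; dataset loading is omitted, and names such as `graphs`, `y`, `train_idx`, and `test_idx` are assumptions rather than part of the proposal.

```python
# A minimal sketch of a graph-kernel + SVM pipeline, assuming the graphs are
# networkx objects and labels are a numpy array. The kernel is a simple
# shortest-path-length histogram kernel, shown only for illustration.
import networkx as nx
import numpy as np
from sklearn.svm import SVC

def sp_histogram(g, max_len=10):
    """Histogram of all-pairs shortest-path lengths for one graph."""
    hist = np.zeros(max_len)
    for _, lengths in nx.all_pairs_shortest_path_length(g):
        for d in lengths.values():
            if 0 < d <= max_len:
                hist[d - 1] += 1
    return hist

def sp_kernel_matrix(graphs):
    """Gram matrix: dot products of normalized shortest-path histograms."""
    feats = np.array([sp_histogram(g) for g in graphs])
    feats = feats / np.maximum(np.linalg.norm(feats, axis=1, keepdims=True), 1e-9)
    return feats @ feats.T

# K = sp_kernel_matrix(graphs)                      # Gram matrix over all graphs
# clf = SVC(kernel="precomputed")
# clf.fit(K[train_idx][:, train_idx], y[train_idx]) # train on the train/train block
# preds = clf.predict(K[test_idx][:, train_idx])    # test rows vs. train columns
```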
With the introduction of deep learning frameworks, many methods have been proposed to derive graphlet features and use them for classification. Convolutional neural networks (CNNs) are widely used for image recognition and classification. A graph can be represented as an image-like structure that a CNN can process by embedding its nodes and computing 2D histograms. Attention-based classification processes only the informative nodes of a graph, without requiring knowledge of its global structure.
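As an illustration of the image-like representation, the sketch below embeds the nodes of a graph in two dimensions (here with a spectral layout, which is an assumption; [2] matches node embeddings differently), bins them into a 2D histogram, and passes the result to a small PyTorch CNN whose architecture is purely illustrative.

```python
# A rough sketch: graph -> 2-D node embedding -> histogram "image" -> tiny CNN.
import networkx as nx
import numpy as np
import torch
import torch.nn as nn

def graph_to_image(g, bins=32):
    """Embed nodes in 2-D and bin them into a bins x bins histogram."""
    pos = nx.spectral_layout(g, dim=2)            # 2-D spectral node embedding
    xy = np.array(list(pos.values()))
    img, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=bins,
                               range=[[-1, 1], [-1, 1]])
    return torch.tensor(img, dtype=torch.float32).unsqueeze(0)  # (1, bins, bins)

# A small, illustrative CNN over the histogram "image"; two output classes.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 2),
)

# logits = cnn(graph_to_image(g).unsqueeze(0))    # add a batch dimension first
```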
We plan to perform an attention-guided walk on the graph using a recurrent neural network (RNN) model. The model is trained with reinforcement learning to select the next nodes to process. In this research, we compare the CNN model and the deep reinforcement learning model against the popular graph kernel approach. We also conduct experiments with attention-based classification to analyse how performance improves. For large graphs, we plan to deploy multiple agents in parallel to determine the graph label. We will run our experiments on the National Cancer Institute (NCI) datasets to classify whether a cell is cancerous or not, and compare machine learning and deep learning methods.
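The sketch below outlines one step of such an attention-guided walk, loosely following the structural-attention idea of [1]: an LSTM cell summarizes the walk so far, a linear head scores the current node's neighbours, and the next node is sampled so the choice can later be rewarded with REINFORCE. The class name, layer sizes, and interface are illustrative assumptions, not the exact architecture of [1].

```python
import torch
import torch.nn as nn

class AttentionWalker(nn.Module):
    """Simplified attention-guided walker: history LSTM + neighbour policy."""
    def __init__(self, feat_dim, hidden_dim=64, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTMCell(feat_dim, hidden_dim)        # history of the walk
        self.score = nn.Linear(hidden_dim + feat_dim, 1)      # attention over neighbours
        self.classify = nn.Linear(hidden_dim, num_classes)    # graph-label head

    def step(self, node_feat, neighbor_feats, state=None):
        # Update the walk history with the current node's features.
        h, c = self.lstm(node_feat.unsqueeze(0), state)
        # Score each neighbour given the walk history and sample the next node.
        expanded = h.expand(neighbor_feats.size(0), -1)
        logits = self.score(torch.cat([expanded, neighbor_feats], dim=1)).squeeze(1)
        dist = torch.distributions.Categorical(logits=logits)
        nxt = dist.sample()
        # log_prob is kept so REINFORCE can reward walks that classify correctly.
        return nxt, dist.log_prob(nxt), (h, c)

    def predict(self, state):
        h, _ = state
        return self.classify(h)   # logits for the graph label
```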
PROJECT DELIVERABLES
- Train three graph kernels (random walk, shortest path, and Weisfeiler-Lehman) on the NCI datasets. These results will serve as the baseline for evaluating the other models.
- Represent the graph as a histogram and train a CNN for classification.
- Train a model that combines deep learning (a long short-term memory network) with reinforcement learning to classify graphs using an attention model [1].
- Compare the resulting accuracies with the baseline and analyse the results (see the evaluation sketch after this list).
- Documentation of the CS 298 report.
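A small sketch of how the comparison could be run: every model is scored on the same stratified folds so the kernel baseline and the deep models are evaluated consistently. The `models` dictionary of fit/predict wrappers and the list-of-graphs input are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score

def compare_models(models, graphs, y, folds=10):
    """Mean cross-validated accuracy per model; y is a numpy array of labels."""
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    scores = {name: [] for name in models}
    for train_idx, test_idx in skf.split(graphs, y):
        for name, model in models.items():
            model.fit([graphs[i] for i in train_idx], y[train_idx])
            preds = model.predict([graphs[i] for i in test_idx])
            scores[name].append(accuracy_score(y[test_idx], preds))
    return {name: np.mean(acc) for name, acc in scores.items()}
```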
CHALLENGE AND INNOVATION
- CNNs work well for images; representing graphs in a form a CNN can work with is a challenge [2].
- Performing attention-based classification on a graph, using deep reinforcement learning techniques, without exploring the entire graph [1].
- If the graph is large, it might be impossible to load it into memory. Performance should also be improved by deploying multiple agents and running them in parallel (a rough sketch follows this list).
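A rough sketch of the multi-agent idea, assuming each agent can start from a different node, inspect only its local neighbourhood, and return a label vote; `run_agent` is a stub standing in for one attention-guided walk.

```python
from multiprocessing import Pool
from collections import Counter

def run_agent(args):
    graph_path, start_node = args
    # Placeholder: load only the neighbourhood around start_node, walk from it,
    # and return this agent's predicted label (0 or 1).
    return 0

def classify_with_agents(graph_path, start_nodes, workers=4):
    """Run several independent agents in parallel and take a majority vote."""
    with Pool(workers) as pool:
        votes = pool.map(run_agent, [(graph_path, s) for s in start_nodes])
    return Counter(votes).most_common(1)[0][0]
```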
REFERENCES
[1] J. B. Lee, R. Rossi, and X. Kong, "Graph classification using structural attention," in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2018), London, United Kingdom, pp. 1666-1674.
[2] G. Nikolentzos, P. Meladianos, and M. Vazirgiannis, "Matching node embeddings for graph similarity," in Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17), San Francisco, California.
[3] H. Wang, N. Wang, and D. Yeung, "Convolutional networks on graphs for learning molecular fingerprints," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), Sydney, NSW, Australia, pp. 1235-1244.
[4] G. Nikolentzos, P. Meladianos, A. J. Tixier, K. Skianis, and M. Vazirgiannis, "Kernel graph convolutional networks," in Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS '15), vol. 2, Dec. 2015, Montreal, Canada.
[5] M. Zhang, Z. Cui, and Y. Chen, "An end-to-end deep learning architecture for graph classification," in Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana.