In Pursuit of the Hidden Features of GNN's Internal Representations


Abstract

We consider the problem of explaining Graph Neural Networks (GNNs). While most attempts aim at explaining the final decision of the model, we focus on the hidden layers to examine what the GNN actually captures and to shed light on the hidden features it builds. To that end, we first extract activation rules that identify sets of exceptionally co-activated neurons when classifying graphs of the same category. These rules define internal representations that have a strong impact on the classification process. Then, and this is the goal of the current paper, we interpret these rules by identifying a graph that is fully embedded in the subspace identified by the rule. The graph search is based on a Monte Carlo Tree Search directed by a proximity measure between the graph embedding and the internal representation of the rule, as well as by a realism factor that constrains the label distribution of the graph to be similar to the one observed on the dataset. Experiments on 6 real-world datasets, compared against 3 baselines, demonstrate that our method DISCERN generates realistic graphs of high quality, providing new insights into the respective GNN models.
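
As a rough illustration of how such a search criterion could be assembled, here is a minimal Python sketch. It is not the authors' implementation: the function names, the cosine-similarity choice for the proximity measure, the total-variation form of the realism factor, and the weighting parameter alpha are all assumptions. It only shows how a score combining proximity to a rule's internal representation with a realism term over label distributions could direct a Monte Carlo Tree Search.

import numpy as np

def proximity(graph_embedding: np.ndarray, rule_repr: np.ndarray) -> float:
    """One plausible proximity measure: cosine similarity between the
    candidate graph's embedding and the rule's internal representation."""
    den = float(np.linalg.norm(graph_embedding) * np.linalg.norm(rule_repr))
    return float(np.dot(graph_embedding, rule_repr)) / den if den > 0 else 0.0

def realism(graph_label_dist: np.ndarray, dataset_label_dist: np.ndarray) -> float:
    """Realism factor (assumed form): 1 minus the total variation distance
    between the candidate graph's label distribution and the distribution
    observed on the dataset."""
    return 1.0 - 0.5 * float(np.abs(graph_label_dist - dataset_label_dist).sum())

def mcts_reward(graph_embedding, rule_repr, graph_label_dist, dataset_label_dist,
                alpha: float = 0.5) -> float:
    """Reward guiding the tree search: a convex combination of proximity
    and realism (the weighting scheme is an assumption)."""
    return (alpha * proximity(graph_embedding, rule_repr)
            + (1.0 - alpha) * realism(graph_label_dist, dataset_label_dist))

# Toy usage with random embeddings and two node-label distributions.
rng = np.random.default_rng(0)
emb, rule = rng.normal(size=16), rng.normal(size=16)
g_dist = np.array([0.5, 0.3, 0.2])
d_dist = np.array([0.6, 0.25, 0.15])
print(mcts_reward(emb, rule, g_dist, d_dist))

In an MCTS setting, such a reward would be computed at rollout time for each candidate graph expansion, so that branches yielding graphs both close to the rule's subspace and realistic with respect to the dataset are explored preferentially.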


Bibtex (lrde.bib)

@Article{	  veyrin-forrer.22.dke,
  title		= {In Pursuit of the Hidden Features of {GNN}'s Internal
		  Representations},
  journal	= {Data \& Knowledge Engineering},
  volume	= {142},
  pages		= {102097},
  year		= {2022},
  month		= nov,
  issn		= {0169-023X},
  doi		= {10.1016/j.datak.2022.102097},
  author	= {Luca Veyrin-Forrer and Ataollah Kamal and Stefan Duffner
		  and Marc Plantevit and C\'{e}line Robardet},
  keywords	= {Graph Neural Networks, Explainable artificial
		  intelligence, Monte Carlo Tree Search},
  publisher	= {Elsevier},
  abstract	= {We consider the problem of explaining Graph Neural
		  Networks (GNNs). While most attempts aim at explaining the
		  final decision of the model, we focus on the hidden layers
		  to examine what the GNN actually captures and to shed
		  light on the hidden features it builds. To that end, we
		  first extract activation rules that identify sets of
		  exceptionally co-activated neurons when classifying graphs
		  of the same category. These rules define internal
		  representations that have a strong impact on the
		  classification process. Then, and this is the goal of the
		  current paper, we interpret these rules by identifying a
		  graph that is fully embedded in the subspace identified by
		  the rule. The graph search is based on a Monte Carlo Tree
		  Search directed by a proximity measure between the graph
		  embedding and the internal representation of the rule, as
		  well as by a realism factor that constrains the label
		  distribution of the graph to be similar to the one
		  observed on the dataset. Experiments on 6 real-world
		  datasets, compared against 3 baselines, demonstrate that
		  our method DISCERN generates realistic graphs of high
		  quality, providing new insights into the respective GNN
		  models.}
}