GAMS: Graph Augmentation with Module Swapping

Torsello, Andrea; Bicciato, Alessandro
2022-01-01

Abstract

Data augmentation is a widely adopted approach to meeting the large-data requirements of modern deep learning techniques by generating new data instances from an existing dataset. While there is extensive literature and practical experience on augmentation for vectorial or image-based data, there is relatively little work on graph-based representations. This is largely due to the complex, non-Euclidean structure of graphs, which limits our ability to identify operations that do not alter the original semantic grouping. In this paper, we propose an alternative method for enlarging the graph set of graph neural network datasets by creating new graphs that keep the properties of the originals. The proposal starts from the assumption that graphs compose a set of smaller motifs into larger structures. To this end, we extract modules by grouping nodes in an unsupervised way, and then swap similar modules between different graphs, reconstructing the missing connectivity based on the original edge statistics and node similarity. We then test the performance of the proposed augmentation approach against state-of-the-art approaches, showing that on datasets where the information is dominated by structure rather than node labels, we obtain a significant improvement with respect to the alternatives.
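
The three steps named in the abstract (extract modules, swap them between graphs, reconstruct the cut connectivity) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the grouping or reconnection rules, so every function name and heuristic below is an assumption. Modules are found by dropping edges whose endpoints share no neighbour, the caller chooses which modules to swap (no similarity search is performed), and each cut edge is reattached to the donor node with the closest degree as a crude stand-in for node similarity.

```python
def graph(edges):
    """Build an undirected adjacency dict {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def modules(adj):
    """Hypothetical grouping rule: drop edges whose endpoints share no
    neighbour (bridge-like links), then take connected components."""
    kept = {v: {u for u in adj[v] if adj[v] & adj[u]} for v in adj}
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(kept[v] - part)
        seen |= part
        parts.append(part)
    return parts


def swap_module(adj_a, mod_a, adj_b, mod_b):
    """Return a copy of graph A in which mod_a is replaced by B's mod_b."""
    # Relabel donor nodes so they cannot clash with A's node ids.
    new_id = {v: ("b", v) for v in mod_b}
    out = {v: adj_a[v] - mod_a for v in adj_a if v not in mod_a}
    for v in mod_b:
        out[new_id[v]] = {new_id[u] for u in adj_b[v] if u in mod_b}
    # Rebuild each cut edge (u outside, v inside mod_a): attach u to the donor
    # node whose degree in B is closest to v's degree in A (assumed heuristic).
    for v in sorted(mod_a):
        for u in sorted(adj_a[v] - mod_a):
            target = min(sorted(mod_b),
                         key=lambda w: abs(len(adj_b[w]) - len(adj_a[v])))
            out[u].add(new_id[target])
            out[new_id[target]].add(u)
    return out


# Graph A: two triangles joined by the bridge 2-3.
a = graph([(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)])
# Graph B: a 4-clique joined to a triangle by the bridge 13-14.
b = graph([(10, 11), (10, 12), (10, 13), (11, 12), (11, 13), (12, 13),
           (13, 14), (14, 15), (14, 16), (15, 16)])

mods_a, mods_b = modules(a), modules(b)
# Replace A's second triangle with B's 4-clique.
augmented = swap_module(a, mods_a[1], b, mods_b[0])
print(len(augmented), "nodes in the augmented graph")
```

Because several cut edges from one outside node can map to the same donor node, this sketch may produce fewer boundary edges than the original graph had; a fuller treatment would sample the reconnection from the original edge statistics, as the abstract describes.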
2022
Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods
Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3754007