How can we separate structural information from noise in large graphs? To address this fundamental question, we propose a graph summarization approach based on Szemer'edi's Regularity Lemma, a well-known result in graph theory, which roughly states that every graph can be approximated by the union of a small number of random-like bipartite graphs called ``regular pairs''. Hence, the Regularity Lemma provides us with a principled way to describe the essential structure of large graphs using a small amount of data. Our paper has several contributions: (i) We present our summarization algorithm which is able to reveal the main structural patterns in large graphs. (ii) We discuss how to use our summarization framework to efficiently retrieve from a database the top-\$k\$ graphs that are most similar to a query graph. (iii) Finally, we evaluate the noise robustness of our approach in terms of the reconstruction error and the usefulness of the summaries in addressing the graph search task.

### Separating Structure from Noise in Large Graphs Using the Regularity Lemma

#### Abstract

How can we separate structural information from noise in large graphs? To address this fundamental question, we propose a graph summarization approach based on Szemer'edi's Regularity Lemma, a well-known result in graph theory, which roughly states that every graph can be approximated by the union of a small number of random-like bipartite graphs called ``regular pairs''. Hence, the Regularity Lemma provides us with a principled way to describe the essential structure of large graphs using a small amount of data. Our paper has several contributions: (i) We present our summarization algorithm which is able to reveal the main structural patterns in large graphs. (ii) We discuss how to use our summarization framework to efficiently retrieve from a database the top-\$k\$ graphs that are most similar to a query graph. (iii) Finally, we evaluate the noise robustness of our approach in terms of the reconstruction error and the usefulness of the summaries in addressing the graph search task.
##### Scheda breve Scheda completa Scheda completa (DC) 2020
98
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: `https://hdl.handle.net/10278/3719412`
##### Citazioni
• ND
• 4
• 4