How can we separate structural information from noise in large graphs? To address this fundamental question, we propose a graph summarization approach based on Szemer'edi's Regularity Lemma, a well-known result in graph theory, which roughly states that every graph can be approximated by the union of a small number of random-like bipartite graphs called ``regular pairs''. Hence, the Regularity Lemma provides us with a principled way to describe the essential structure of large graphs using a small amount of data. Our paper has several contributions: (i) We present our summarization algorithm which is able to reveal the main structural patterns in large graphs. (ii) We discuss how to use our summarization framework to efficiently retrieve from a database the top-$k$ graphs that are most similar to a query graph. (iii) Finally, we evaluate the noise robustness of our approach in terms of the reconstruction error and the usefulness of the summaries in addressing the graph search task.
I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.