Simultaneous Clustering and Outlier Detection using Dominant Sets
Mequanint, Eyasu Zemene; Pelillo, Marcello
2016-01-01
Abstract
We present a unified approach to simultaneous clustering and outlier detection. We exploit properties of a family of quadratic optimization problems related to dominant sets, a well-known graph-theoretic notion of a cluster that generalizes the concept of a maximal clique to edge-weighted graphs. Unlike most, if not all, previous techniques, our framework lets the number of clusters arise naturally and discards outliers automatically, so the resulting algorithm infers both the number of clusters and the number of outliers from the data. Experiments on real and large-scale synthetic datasets demonstrate the effectiveness of our approach and the utility of carrying out clustering and outlier detection concurrently.
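
For readers unfamiliar with dominant sets, the following minimal sketch illustrates the standard quadratic program from the dominant-set literature: maximize x^T A x over the probability simplex, here solved with discrete replicator dynamics. It is not the authors' implementation and does not reproduce the paper's outlier handling. The function name `dominant_set`, the Gaussian similarity, the toy points, and the `1e-4` support threshold are all assumptions made for illustration.

```python
import numpy as np

def dominant_set(A, tol=1e-8, max_iter=10_000):
    """Locally maximize x^T A x on the simplex with replicator dynamics.

    A is a symmetric, non-negative similarity matrix with zero diagonal.
    Returns the final weight vector and its cohesiveness x^T A x.
    """
    n = A.shape[0]
    x = np.full(n, 1.0 / n)          # start from the simplex barycenter
    for _ in range(max_iter):
        Ax = A @ x
        f = x @ Ax                   # current objective value
        if f <= 0:                   # degenerate case: no similarity mass
            break
        x_new = x * Ax / f           # replicator update, stays on the simplex
        converged = np.abs(x_new - x).sum() < tol
        x = x_new
        if converged:
            break
    return x, float(x @ A @ x)

# Toy data (assumed): three nearby points and one far-away point.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-d2)                      # Gaussian similarity (assumed kernel)
np.fill_diagonal(A, 0.0)             # no self-similarity

x, cohesiveness = dominant_set(A)
members = np.flatnonzero(x > 1e-4)   # support of x = extracted cluster
print(members, round(cohesiveness, 3))
```

In this toy example the support of `x` covers the three nearby points, while the isolated point receives essentially zero weight. One common way to obtain several clusters is to remove the extracted points and repeat; how the paper decides which remaining points are outliers is not specified in this abstract.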