Comparison of clustering methods for diabetic kidney disease patients formalized through category theory

Maria Mannone;Veronica Distefano;Claudio Silvestri;Irene Poli
2021-01-01

Abstract

Precision medicine aims to find the best individualized treatment for each patient. In particular, type-2 diabetes patients who present kidney complications (diabetic kidney disease, DKD) show considerable heterogeneity in their response to therapeutic treatment. With the aim of developing a decision system that finds the best individualized drug combination, we look for subgroups of similar patients. To obtain a precise grouping of patients, we compare two clustering methods. The first is based on agglomerative hierarchical clustering with the Gower distance for mixed data, and the second is based on the k-medoids algorithm. Comparing two patients (across all their variables) with the Gower distance gives a scalar; the pairwise comparison of all patients gives a dissimilarity matrix for each time point. The k-medoids algorithm is based on a generalized distance, suitable for mixed data, and minimizes the distance between the points of each cluster and the medoid of that cluster. The comparison between methods is contextualized within the theoretical framework of category theory, which formalizes the idea of transformation between transformations. A category consists of objects (points) and morphisms (arrows) between them. Categories allow for nested comparisons. The morphisms between categories are called functors, and the comparison between functors is a natural transformation. A clustering method can be seen as a functor from a dataset equipped with distances to a partition of the dataset. We can extend this idea to the comparison of clustering methods, formalizing it as a natural transformation. We compare these methods using the DC-ren longitudinal dataset, which contains mixed data of DKD patients. With both methods, we build clusters of similar patients and analyze the mean values of their variables and their response to the given drugs. The theoretical contextualization can help transfer theorems and prior knowledge from an abstract field to an applied one, giving new insights for further research and studies.
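As an illustration of the two ingredients named in the abstract, the following minimal Python sketch computes a Gower dissimilarity matrix for mixed data and then groups patients around medoids. It is not the authors' implementation: the toy records, the variable names, the choice of initial medoids, and the simple alternating update (rather than, for example, the PAM algorithm) are all assumptions made for the example, and the agglomerative hierarchical step is not shown.

```python
from itertools import combinations

# Hypothetical toy records (age, eGFR, sex, on_drug_A); NOT DC-ren data.
patients = [
    (54, 62.0, "F", True),
    (61, 45.0, "M", True),
    (47, 88.0, "F", False),
    (70, 30.0, "M", False),
]
NUMERIC = (0, 1)      # column indices of numeric variables
CATEGORICAL = (2, 3)  # column indices of categorical variables


def gower_matrix(rows):
    """Return the symmetric matrix of pairwise Gower dissimilarities."""
    # Range of each numeric column, used to scale differences into [0, 1].
    ranges = {}
    for c in NUMERIC:
        col = [r[c] for r in rows]
        ranges[c] = (max(col) - min(col)) or 1.0  # guard against zero range
    n_vars = len(NUMERIC) + len(CATEGORICAL)
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Numeric variables: range-scaled absolute difference.
        total = sum(abs(rows[i][c] - rows[j][c]) / ranges[c] for c in NUMERIC)
        # Categorical variables: 0 if equal, 1 if different.
        total += sum(rows[i][c] != rows[j][c] for c in CATEGORICAL)
        d[i][j] = d[j][i] = total / n_vars  # one scalar per patient pair
    return d


def assign(d, medoids):
    """Label each patient with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda k: d[i][medoids[k]])
            for i in range(len(d))]


def k_medoids(d, initial_medoids, max_iter=100):
    """Alternate assignment and medoid update on a precomputed matrix."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(d, medoids)
        new_medoids = []
        for k, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:  # keep the old medoid if a cluster is empty
                new_medoids.append(old)
                continue
            # New medoid: member with the smallest total dissimilarity
            # to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(d[m][i] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(d, medoids)


D = gower_matrix(patients)
medoids, labels = k_medoids(D, initial_medoids=[0, 3])
```

In a longitudinal setting such as the one described, one such matrix would be computed per time point. Because the clustering step takes only the precomputed matrix `D` as input, the same dissimilarities could also be passed to a hierarchical method, which is what makes the two approaches comparable on equal footing.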
2021
APPLIED STATISTICS 2021

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3743831