Modern data collection techniques make it possible to analyze a very large number of endpoints. In biomedical research, for example, the expression levels of thousands of genes are commonly measured on only a small number of subjects. In these situations, traditional methods for comparison studies are not applicable. Moreover, the assumption of a normal distribution is often questionable for high-dimensional data, and some variables may also be highly correlated with others. Hypothesis tests based on interpoint distances are very appealing for studies involving the comparison of means, because they do not assume that the data come from normally distributed populations and because they include tests that are distribution-free, unbiased, consistent, and computationally feasible even when the number of endpoints is much larger than the number of subjects. New tests based on interpoint distances are proposed for multivariate studies involving the simultaneous comparison of means and variability, or of whole distribution shapes. The tests are shown to perform well in terms of power when the endpoints have complex dependence relations, as in genomic and metabolomic studies. A practical application to a genetic cardiovascular case-control study is discussed.
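To make the general idea of an interpoint distance test concrete, the sketch below runs a generic two-sample permutation test on an energy-type statistic (twice the mean between-group distance minus the two mean within-group distances). It is not one of the tests proposed in the article; the statistic, the function names, and the simulated data are assumptions chosen only for illustration. The point it shows is that everything is computed from the subject-by-subject distance matrix, so the cost of each permutation depends on the number of subjects rather than the number of endpoints.

```python
import math
import random


def distance_matrix(rows):
    """Euclidean distance between every pair of observations (rows)."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(rows[i], rows[j])
    return dist


def mean_distance(dist, idx_a, idx_b):
    """Average distance over all pairs (i, j) with i in idx_a and j in idx_b."""
    total = sum(dist[i][j] for i in idx_a for j in idx_b)
    return total / (len(idx_a) * len(idx_b))


def energy_statistic(dist, idx_a, idx_b):
    """Twice the mean between-group distance minus both mean within-group distances."""
    return (2.0 * mean_distance(dist, idx_a, idx_b)
            - mean_distance(dist, idx_a, idx_a)
            - mean_distance(dist, idx_b, idx_b))


def permutation_test(x, y, n_perm=199, seed=0):
    """Return (observed statistic, permutation p-value) for H0: same distribution."""
    rng = random.Random(seed)
    dist = distance_matrix(x + y)  # computed once; permutations only relabel subjects
    n = len(x)
    labels = list(range(len(x) + len(y)))
    observed = energy_statistic(dist, labels[:n], labels[n:])
    exceed = 1  # the observed labelling counts as one of the permutations
    for _ in range(n_perm):
        rng.shuffle(labels)
        if energy_statistic(dist, labels[:n], labels[n:]) >= observed:
            exceed += 1
    return observed, exceed / (n_perm + 1)


# Hypothetical toy data: 8 subjects per group, 200 endpoints,
# with a mean shift of 1 on every endpoint in the second group.
data_rng = random.Random(1)
x = [[data_rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(8)]
y = [[data_rng.gauss(1.0, 1.0) for _ in range(200)] for _ in range(8)]
stat, p_value = permutation_test(x, y)
print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Because the p-value comes from relabelling subjects, the sketch makes no normality assumption, which is the property the abstract highlights for this family of tests.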
Marco Marozzi (corresponding author)
|Publication date:||2020|
|Title:||Interpoint distance tests for high-dimensional comparison studies|
|Journal:||JOURNAL OF APPLIED STATISTICS|
|Appears in types:||2.1 Journal article|
Files in this item:
|jas2020.pdf||full article||Publisher's version||Closed access (personal)||Restricted|