Well Begun Is Half Done: The Impact of Pre-Processing in MALDI Mass Spectrometry Imaging Analysis Applied to a Case Study of Thyroid Nodules
Capitoli, Giulia; Nobile, Marco S.
2025-01-01
Abstract
Proteomic biomarkers for cancer research can be discovered in situ using Matrix-Assisted Laser Desorption Ionization (MALDI) Mass Spectrometry Imaging (MSI). However, because of experimental limitations, the spectra acquired by MALDI-MSI can be noisy, so pre-processing steps are generally needed to reduce instrumental and analytical variability. Thus far, the importance and effect of standard pre-processing methods, as well as of their combinations and parameter settings, have not been extensively investigated in proteomics applications. In this work, we present a systematic study of 15 combinations of pre-processing steps (baseline correction, smoothing, normalization, and peak alignment) for a real-data classification task on MALDI-MSI data measured from fine-needle aspiration biopsies of thyroid nodules. The influence of each combination was assessed by analyzing the feature extraction, the pixel-by-pixel classification probabilities, and the LASSO classification performance. Our results highlight the necessity of fine-tuning the pre-processing pipeline, especially for the reliable transfer of molecular diagnostic signatures into clinical practice. We outline recommendations on the selection of pre-processing steps, together with filter levels and alignment methods, according to the mass-to-charge range and the heterogeneity of the data.

| File | Size | Format |
|---|---|---|
| stats-08-00057-v2.pdf (open access; Type: Publisher's version; License: Creative Commons) | 16.73 MB | Adobe PDF |
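The pre-processing steps named in the abstract can be illustrated with a minimal sketch for a single spectrum. This is not the authors' pipeline: the rolling-minimum baseline, the moving-average smoother, the total-ion-current (TIC) normalization, the window sizes, and all function names are assumptions chosen for illustration, and peak alignment is omitted because it requires several spectra.

```python
from typing import List


def subtract_baseline(intensities: List[float], window: int = 5) -> List[float]:
    """Subtract a rolling-minimum baseline estimate (assumed method)."""
    n = len(intensities)
    half = window // 2
    corrected = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        # The local minimum stands in for the baseline at position i.
        corrected.append(intensities[i] - min(intensities[lo:hi]))
    return corrected


def smooth(intensities: List[float], window: int = 3) -> List[float]:
    """Moving-average smoothing; windows are truncated at the edges."""
    n = len(intensities)
    half = window // 2
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        segment = intensities[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def tic_normalize(intensities: List[float]) -> List[float]:
    """Divide by the total ion current so the intensities sum to 1."""
    total = sum(intensities)
    if total == 0:
        return list(intensities)  # leave an all-zero spectrum unchanged
    return [x / total for x in intensities]


def preprocess(spectrum: List[float],
               baseline_window: int = 5,
               smooth_window: int = 3) -> List[float]:
    """Apply one hypothetical combination: baseline, smoothing, TIC."""
    corrected = subtract_baseline(spectrum, baseline_window)
    return tic_normalize(smooth(corrected, smooth_window))


if __name__ == "__main__":
    # Toy spectrum: a flat offset of 10 with a single peak in the middle.
    toy = [10.0, 10.0, 12.0, 30.0, 12.0, 10.0, 10.0]
    print(preprocess(toy))
```

Changing the order of the steps, the window sizes, or the normalization yields a different combination, which is the kind of variation the study compares.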
Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.



