Vulnerability detection is particularly relevant in smart contracts, where modifying the code after deployment is impossible. Machine learning solutions provide greater efficiency than static analyzers in speed and detection. This study evaluates various classic machine-learning techniques and state-of-the-art neural networks for training a vulnerability detector. We analyze the largest and most reliably labelled dataset of smart contracts currently available, experimenting with six data representations of smart contracts and a multimodal approach. Our experiments show that both deep and traditional machine learning methods excel in different scenarios. Notably, eXtreme Gradient Boosting achieved an F1-score of 0.91 with the multimodal approach, which suggests its potential for more robust classification. At the same time, the results underscore the need for larger datasets to showcase the full potential of the evaluated methods.

A Comparison of Machine Learning Techniques for Ethereum Smart Contract Vulnerability Detection

Rizzo M.
Membro del Collaboration Group
;
Ressi D.
Membro del Collaboration Group
;
Rossi S.
Membro del Collaboration Group
2025-01-01

Abstract

Vulnerability detection is particularly relevant in smart contracts, where modifying the code after deployment is impossible. Machine learning solutions provide greater efficiency than static analyzers in speed and detection. This study evaluates various classic machine-learning techniques and state-of-the-art neural networks for training a vulnerability detector. We analyze the largest and most reliably labelled dataset of smart contracts currently available, experimenting with six data representations of smart contracts and a multimodal approach. Our experiments show that both deep and traditional machine learning methods excel in different scenarios. Notably, eXtreme Gradient Boosting achieved an F1-score of 0.91 with the multimodal approach, which suggests its potential for more robust classification. At the same time, the results underscore the need for larger datasets to showcase the full potential of the evaluated methods.
2025
CEUR Workshop Proceedings
File in questo prodotto:
File Dimensione Formato  
Overlay_2024_.pdf

non disponibili

Tipologia: Documento in Pre-print
Licenza: Copyright dell'editore
Dimensione 650.32 kB
Formato Adobe PDF
650.32 kB Adobe PDF   Visualizza/Apri

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10278/5105052
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact