GAM Forest Explanation

Lucchese C.; Orlando S.; Perego R.; Veneri A.

2023

Abstract

The most accurate machine learning models unfortunately produce black-box predictions, for which it is impossible to grasp the internal logic that leads to a specific decision. Unfolding the logic of such black-box models is increasingly important, especially when they are used in sensitive decision-making processes. In this work we focus on forests of decision trees, which may include hundreds to thousands of trees to produce accurate predictions. Such complexity creates a need for explanations of the predictions generated by large forests. We propose a post hoc explanation method for large forests, named GAM-based Explanation of Forests (GEF), which builds a Generalized Additive Model (GAM) able to explain, both locally and globally, the impact on the predictions of a limited set of features and feature interactions. We evaluate GEF on both synthetic and real-world datasets and show that GEF can create a GAM with high fidelity by analyzing only the given forest, without using any further information, not even the initial training dataset.
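The general idea of fitting an additive surrogate from a forest alone can be illustrated with a toy sketch. This is not the GEF algorithm, whose details are not given in the abstract. It assumes a hypothetical tuple-based tree encoding, a forest whose prediction is the sum of its trees, and a piecewise-constant additive model fitted by backfitting. All names (`FOREST`, `fit_additive`, `sampling_points`, and so on) are illustrative. The synthetic data is derived only from the split thresholds found in the trees, so no training dataset is involved.

```python
from itertools import product
from math import sqrt
from statistics import mean

# Hypothetical encoding: a tree is a float (leaf value) or a tuple
# (feature_index, threshold, left_subtree, right_subtree).
# Assumption: the forest prediction is the sum of the tree outputs.
FOREST = [
    (0, 0.5, -1.0, 1.0),
    (1, 2.0, 0.0, (1, 4.0, 0.5, 2.0)),
    (0, 1.5, 0.0, 0.3),
]

def predict_tree(tree, x):
    # Walk down until a leaf (a plain float) is reached.
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree

def predict_forest(forest, x):
    return sum(predict_tree(tree, x) for tree in forest)

def collect_thresholds(tree, out):
    # Gather the split thresholds of each feature, reading the tree only.
    if isinstance(tree, tuple):
        feature, threshold, left, right = tree
        out.setdefault(feature, set()).add(threshold)
        collect_thresholds(left, out)
        collect_thresholds(right, out)
    return out

def sampling_points(thresholds):
    # One representative value per interval induced by the thresholds.
    t = sorted(thresholds)
    mids = [(a + b) / 2 for a, b in zip(t, t[1:])]
    return [t[0] - 1.0] + mids + [t[-1] + 1.0]

def fit_additive(forest, n_passes=3):
    thresholds = {}
    for tree in forest:
        collect_thresholds(tree, thresholds)
    features = sorted(thresholds)
    grid = {j: sampling_points(thresholds[j]) for j in features}

    # Synthetic dataset: full grid of representative points, labeled by the forest.
    X = [dict(zip(features, values))
         for values in product(*(grid[j] for j in features))]
    y = [predict_forest(forest, x) for x in X]

    # Additive model: intercept + sum_j shape[j][x_j], fitted by backfitting.
    intercept = mean(y)
    shape = {j: {v: 0.0 for v in grid[j]} for j in features}
    for _ in range(n_passes):
        for j in features:
            for v in grid[j]:
                residuals = [
                    yi - intercept
                    - sum(shape[k][x[k]] for k in features if k != j)
                    for x, yi in zip(X, y) if x[j] == v
                ]
                shape[j][v] = mean(residuals)

    # Fidelity: RMSE between the additive model and the forest on the grid.
    errors = [
        (yi - intercept - sum(shape[j][x[j]] for j in features)) ** 2
        for x, yi in zip(X, y)
    ]
    return intercept, shape, sqrt(mean(errors))

intercept, shape, rmse = fit_additive(FOREST)
print(f"fidelity RMSE on the synthetic grid: {rmse:.3g}")
```

Every tree in this toy forest splits on a single feature, so the forest is itself additive and the surrogate can match it on the grid. A forest whose trees mix several features would leave a residual error, which is where the feature-interaction terms mentioned in the abstract would come in.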

Publication year: 2023
Volume title: EDBT 2023 - Proceedings 26th International Conference on Extending Database Technology
DOI: https://dx.doi.org/10.48786/edbt.2023.14
Appears in types: 4.1 Conference proceedings article

Files in this item:

File	Size	Format
paper-207.pdf (open access; type: publisher's version; license: Creative Commons)	2.28 MB	Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5003875
