A multi-level approach for hierarchical Ticket Classification

The automatic categorization of support tickets is a fundamental tool for modern businesses. Such requests are most commonly composed of concise textual descriptions that are noisy and filled with technical jargon. In this paper, we test the effectiveness of pre-trained LMs for the classification of issues related to software bugs. First, we test several strategies to produce single, ticket-wise representations starting from their BERT-generated word embeddings. Then, we showcase a simple yet effective way to build a multi-level classifier for the categorization of documents with two hierarchically dependent labels. We experiment on a public bugs dataset and compare our results with standard BERT-based and traditional SVM classifiers. Our findings suggest that both embedding strategies and hierarchical label dependencies considerably impact classification accuracy.

A multi-level approach for hierarchical Ticket Classification

Marcuzzo Matteo^{Writing – Original Draft Preparation};Zangari Alessandro^{Writing – Original Draft Preparation};Schiavinato Michele^{Writing – Review & Editing};Giudice Lorenzo^{Writing – Review & Editing};Gasparetto Andrea^Supervision;Albarelli Andrea^Supervision

2022

Abstract

The automatic categorization of support tickets is a fundamental tool for modern businesses. Such requests are most commonly composed of concise textual descriptions that are noisy and filled with technical jargon. In this paper, we test the effectiveness of pre-trained LMs for the classification of issues related to software bugs. First, we test several strategies to produce single, ticket-wise representations starting from their BERT-generated word embeddings. Then, we showcase a simple yet effective way to build a multi-level classifier for the categorization of documents with two hierarchically dependent labels. We experiment on a public bugs dataset and compare our results with standard BERT-based and traditional SVM classifiers. Our findings suggest that both embedding strategies and hierarchical label dependencies considerably impact classification accuracy.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno pubblicazione
	
				2022
			
	Titolo del volume
	
				Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022)
			
	Appare nelle tipologie:
	
				4.1 Articolo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
2022.wnut-1.22.pdf accesso aperto Tipologia: Versione dell'editore Licenza: Creative commons Dimensione 1.46 MB Formato Adobe PDF Visualizza/Apri	1.46 MB	Adobe PDF	Visualizza/Apri

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10278/5017705

Citazioni

ND

ND

ND

social impact