A Hybrid RAG Framework for Bug-to-Feature Mapping

Roy, Mandira; Das, Souvick; Chaki, Nabendu; Cortesi, Agostino
2025-01-01

Abstract

Reliability and efficiency are two of the main concerns in software design. These quality objectives can be met through efficient design and bug management. Developers and stakeholders periodically report a large number of software bugs. These bug reports often lack proper descriptions of the issues and correct links to the affected entities. Resolving them therefore requires efficient bug localization and mapping of the reported bugs to the exact software features. Most related research focuses primarily on bug classification and categorization to identify high-priority bugs. This work proposes a novel methodology for analyzing bugs and identifying the relevant software features. The proposed methodology uses large language models (LLMs) to map bug reports to relevant software features. The effectiveness of this approach is evaluated using two state-of-the-art LLMs, each employed with naive Retrieval-Augmented Generation (RAG) and advanced RAG. The comparative study identifies notable distinctions between the results of advanced RAG and naive RAG in identifying software features. These findings highlight the potential of LLM-powered retrieval methods for improving automated bug localization, paving the way for more efficient software maintenance and debugging workflows.
2025
Computer Information Systems and Industrial Management
Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5101788