Software Architectural Process (SAP) is a core and excessively knowledge intensive phase of software development life cycle, as it consumes and produces knowledge artifacts, simultaneously. SAP is about making design decisions, and the changes in these verdicts may pose adverse effects on software projects. The performance and properties of software components are fundamentally influenced by the design decisions. The implementation of immature and abrupt design decisions seriously threatens the development process of SAP. Moreover, software architectural knowledge management (AKM) approaches offer systematic ways to support SAP through versatile architectural solutions and design decisions. However, the majority of software organizations have limited access to data and still depend upon manually created and maintained AKM process. In this paper, we have utilized the one of the most prominent online community for software development (i.e., Stack Overflow) as a source of SAP knowledge to support AKM. In order to support AKM, we have proposed a supervised machine learning‐based approach to classify the architectural knowledge into predefined categories, that is, analysis, synthesis, evaluation, and implementation. We have employed different combinations of feature selection technique to achieve the optimal classification results of the used classifiers (Support Vector Machine [SVM], K‐Nearest Neighbor, Random Forest, and Naive Bayes [NB]). Among these classifiers, SVM with Uni‐gram feature set provides best classification results and attains 85.80% accuracy. For evaluating the proposed approach's effectiveness, we have also computed the suitability of the classifiers, that is, the cost of computation along with its accuracy, and NB with Uni‐gram feature set proved to be the most suitable.

Mining Software Architecture Knowledge: Classifying Stack Overflow Posts Using Machine Learning

Anees Baqir
Formal Analysis
;
2021-01-01

Abstract

Software Architectural Process (SAP) is a core and excessively knowledge intensive phase of software development life cycle, as it consumes and produces knowledge artifacts, simultaneously. SAP is about making design decisions, and the changes in these verdicts may pose adverse effects on software projects. The performance and properties of software components are fundamentally influenced by the design decisions. The implementation of immature and abrupt design decisions seriously threatens the development process of SAP. Moreover, software architectural knowledge management (AKM) approaches offer systematic ways to support SAP through versatile architectural solutions and design decisions. However, the majority of software organizations have limited access to data and still depend upon manually created and maintained AKM process. In this paper, we have utilized the one of the most prominent online community for software development (i.e., Stack Overflow) as a source of SAP knowledge to support AKM. In order to support AKM, we have proposed a supervised machine learning‐based approach to classify the architectural knowledge into predefined categories, that is, analysis, synthesis, evaluation, and implementation. We have employed different combinations of feature selection technique to achieve the optimal classification results of the used classifiers (Support Vector Machine [SVM], K‐Nearest Neighbor, Random Forest, and Naive Bayes [NB]). Among these classifiers, SVM with Uni‐gram feature set provides best classification results and attains 85.80% accuracy. For evaluating the proposed approach's effectiveness, we have also computed the suitability of the classifiers, that is, the cost of computation along with its accuracy, and NB with Uni‐gram feature set proved to be the most suitable.
File in questo prodotto:
File Dimensione Formato  
Wiley.pdf

non disponibili

Tipologia: Versione dell'editore
Licenza: Accesso chiuso-personale
Dimensione 1.17 MB
Formato Adobe PDF
1.17 MB Adobe PDF   Visualizza/Apri

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10278/3738491
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 6
  • ???jsp.display-item.citation.isi??? 2
social impact