Extended Association Rule Mining with Correlation Functions

Yucel, Zeynep
2018-01-01

Abstract

This paper proposes an extension of association rule mining that can handle correlation functions. An extended association rule has the form A ⇒ Correl(X; Y), where Correl(X; Y) is a correlation function of two variables X and Y. With this extension, data analysts can discover, from a large dataset, the conditions A that lead to low (or high) correlation between two given variables. To show the efficacy of the proposed method, a case study is performed on an industry dataset of software development projects. The scenario is to discover conditions under which software development effort is predictable (or unpredictable) from the size of the project, i.e. under which there is a significantly high (or low) correlation between size and effort. Since such conditions cannot be obtained by conventional association rule mining, the case study confirms the effectiveness of the proposed extended association rule mining.
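To make the rule form A ⇒ Correl(X; Y) concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that a condition A is a single attribute-value pair, that Correl is the Pearson correlation coefficient, and that a minimum support count filters out small subsets. The function names, the `min_support` parameter, and the toy project records are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant variable has no defined correlation
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def mine_correlation_rules(records, x, y, attributes, min_support=3):
    """Enumerate single-item conditions A = (attribute, value) and
    evaluate Correl(x, y) on the records that satisfy A.

    Assumption: one attribute-value pair per condition; the paper may
    allow richer conditions.
    """
    rules = []
    for attr in attributes:
        for value in sorted({r[attr] for r in records}):
            subset = [r for r in records if r[attr] == value]
            if len(subset) < min_support:
                continue  # too few records to trust the correlation
            corr = pearson([r[x] for r in subset], [r[y] for r in subset])
            if corr is not None:
                rules.append(((attr, value), len(subset), corr))
    # Strongest absolute correlation first
    return sorted(rules, key=lambda rule: -abs(rule[2]))


# Hypothetical toy data: project type, size, and effort
projects = [
    {"type": "new", "size": 10, "effort": 100},
    {"type": "new", "size": 20, "effort": 200},
    {"type": "new", "size": 30, "effort": 300},
    {"type": "new", "size": 40, "effort": 400},
    {"type": "maintenance", "size": 10, "effort": 300},
    {"type": "maintenance", "size": 20, "effort": 100},
    {"type": "maintenance", "size": 30, "effort": 400},
    {"type": "maintenance", "size": 40, "effort": 200},
]

rules = mine_correlation_rules(projects, "size", "effort", ["type"])
for condition, support, corr in rules:
    print(f"{condition} => Correl(size; effort) = {corr:.2f} (support {support})")
```

In this toy data, effort is fully predictable from size for one project type and not at all for the other, which is the kind of contrast the case-study scenario looks for. A conventional association rule over the same attributes would not expose it, because the consequent here is a statistic over the matching records rather than an item.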
2018
Proc. 3rd IEEE/ACIS International Conference on Big Data, Cloud Computing, and Data Science Engineering (BCD 2018)
Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5080121