
Text Mining Approaches for Dependent Bug Report Assembly and Severity Prediction
In general, most existing bug report studies focus on solving only a single specific issue. Considering multiple
issues at once is required for a more complete and comprehensive bug-fixing process. We took up this challenge and
proposed a method, based on text mining techniques, that analyzes two bug report issues. First, dependent bug reports are
assembled into individual clusters, and then the bug reports in each cluster are analyzed for their severity. Dependent
bug report assembly is evaluated with threshold-based similarity analysis, in which cosine similarity and BM25, each applied
with term frequency (tf) weighting, are compared to identify the more appropriate method. Meanwhile, four classification algorithms
namely Random Forest (RF), Support Vector Machines (SVM) with the RBF kernel function, Multinomial Naïve Bayes (MNB),
and k-Nearest Neighbor (k-NN) are utilized to model the bug severity predictor with four term weighting schemes, i.e., tf, term
frequency-inverse document frequency (tf-idf), term frequency-inverse class frequency (tf-icf), and term frequency-inverse
gravity moment (tf-igm). In the experiments, BM25 was found to be the more appropriate method for dependent bug
report assembly, while for severity prediction, RF with tf-icf weighting yielded the best performance.
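The threshold-based assembly step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tf_vector`, `cosine_similarity`, `bm25_score`, `assemble`), the whitespace tokenization, the greedy seed-based grouping, the BM25 parameters `k1=1.5` and `b=0.75`, and the thresholds are all assumptions chosen for illustration. The BM25 idf term uses a common non-negative variant.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse tf vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of `doc` for `query` (tf vectors) over the corpus `docs`."""
    n_docs = len(docs)
    avg_len = sum(sum(d.values()) for d in docs) / n_docs
    doc_len = sum(doc.values())
    score = 0.0
    for term in query:
        tf = doc.get(term, 0)
        if tf == 0:
            continue  # terms absent from the document contribute nothing
        df = sum(1 for d in docs if term in d)  # document frequency of the term
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return score


def assemble(vectors, threshold, similarity=cosine_similarity):
    """Greedy grouping (an assumed strategy): a report joins the first cluster
    whose seed report scores at or above the threshold, else starts a new one."""
    clusters = []  # each cluster is a list of report indices; the first is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical bug report summaries, used only to exercise the sketch.
reports = [
    "app crashes when saving file",
    "crashes when saving a large file",
    "login button misaligned on mobile",
]
vectors = [tf_vector(r) for r in reports]

cosine_clusters = assemble(vectors, threshold=0.5)
bm25_clusters = assemble(
    vectors, threshold=1.0, similarity=lambda q, d: bm25_score(q, d, vectors)
)
```

Cosine similarity is bounded in [0, 1], whereas BM25 scores are unbounded and depend on corpus statistics, so the two measures need thresholds on different scales. In this sketch the threshold would have to be tuned separately for each measure.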
[46] Zhou Y., Tong Y., Gu R., and Gall H., “Combining Text Mining and Data Mining for Bug Report Classification,” Journal of Software: Evolution and Process, vol. 28, no. 3, pp. 150-176, 2016.
Bancha Luaphol received a Ph.D. degree in Computer Science from Mahasarakham University. He currently works for the Department of Digital Technology, Faculty of Administrative Science, Kalasin University, Thailand. He is currently engaged in the study of applications of natural language processing, and machine learning and deep learning approaches.
Jantima Polpinij received a Ph.D. degree in Computer Science from the University of Wollongong, Australia. She is an associate professor of computer science at Mahasarakham University, Thailand. Her research interests include data science, natural language processing, text mining, and machine learning and deep learning approaches.
Manasawee Kaenampornpan received a Ph.D. degree in Computer Science from the University of Bath, UK. She is an assistant professor of computer science at Mahasarakham University, Thailand. Her research interests are user experience design, context awareness, and mobile and ubiquitous computing.