Automated functional dependency detection between test cases using Doc2Vec and Clustering
2019 (English). In: Proceedings - 2019 IEEE International Conference on Artificial Intelligence Testing, AITest 2019, Institute of Electrical and Electronics Engineers Inc., 2019, p. 19-26. Conference paper, Published paper (Refereed)
Abstract [en]
Knowing about dependencies and similarities between test cases is beneficial for prioritizing them for cost-effective test execution. This holds especially true for the time-consuming, manual execution of integration test cases written in natural language. Test case dependencies are typically derived from requirements and design artifacts. However, such artifacts are not always available, and the derivation process can be very time-consuming. In this paper, we propose, apply and evaluate a novel approach that derives test cases' similarities and functional dependencies directly from the test specification documents written in natural language, without requiring any other data source. Our approach uses an implementation of the Doc2Vec algorithm to detect text-semantic similarities between test cases and then groups them using two clustering algorithms, HDBSCAN and FCM. The correlation between test case text-semantic similarities and their functional dependencies is evaluated in the context of an on-board train control system from Bombardier Transportation AB in Sweden. For this system, the dependencies between the test cases were previously derived and are compared to the results of our approach. The results show that, of the two evaluated clustering algorithms, HDBSCAN performs better than both FCM and a dummy classifier. The classification methods' results are of reasonable quality and especially useful from an industrial point of view. Finally, applying random undersampling to correct the imbalanced data distribution results in an F1 score of up to 75% with the HDBSCAN clustering algorithm.
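The evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that two test cases are predicted to be dependent when a clustering step places them in the same cluster, that the label -1 marks noise as in HDBSCAN's convention, and that undersampling means randomly dropping independent pairs until both classes are the same size. The test case names, cluster labels, known dependencies and the helper names `pairs_from_clusters`, `f1_score` and `undersample` are all hypothetical.

```python
import random
from itertools import combinations


def pairs_from_clusters(labels):
    """Predict a dependency for every pair of test cases in the same cluster.

    Assumption: label -1 means noise and never forms a pair.
    Pairs are returned as tuples in sorted name order.
    """
    predicted = set()
    for a, b in combinations(sorted(labels), 2):
        if labels[a] != -1 and labels[a] == labels[b]:
            predicted.add((a, b))
    return predicted


def f1_score(predicted, actual):
    """F1 score of predicted dependent pairs against known dependent pairs."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def undersample(all_pairs, actual, seed=0):
    """Randomly drop independent pairs until both classes have equal size."""
    positives = [p for p in all_pairs if p in actual]
    negatives = [p for p in all_pairs if p not in actual]
    if len(negatives) > len(positives):
        negatives = random.Random(seed).sample(negatives, len(positives))
    return set(positives) | set(negatives)


# Hypothetical cluster labels, e.g. as produced by a clustering step.
labels = {"TC1": 0, "TC2": 0, "TC3": 1, "TC4": 1, "TC5": -1, "TC6": 0}
# Hypothetical ground-truth dependencies, written in sorted name order.
actual = {("TC1", "TC2"), ("TC3", "TC4"), ("TC1", "TC5")}

predicted = pairs_from_clusters(labels)
print(f1_score(predicted, actual))  # 4/7, about 0.571

# Score again on a balanced subset of all possible pairs.
all_pairs = set(combinations(sorted(labels), 2))
balanced = undersample(all_pairs, actual)
print(f1_score(predicted & balanced, actual & balanced))
```

In this toy data, 4 pairs are predicted and 2 of them are true dependencies out of 3 known ones, giving precision 1/2 and recall 2/3. The second score depends on which independent pairs the random sample keeps.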
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019. p. 19-26
Keywords [en]
Clustering, Doc2Vec, FCM, HDBSCAN, Paragraph Vectors, Software Testing, Test Case Dependency, Artificial intelligence, Cost effectiveness, Semantics, Testing, Bombardier Transportation, Classification methods, Functional dependency, Random under samplings, Test case, Train control systems, Clustering algorithms
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-39269
DOI: 10.1109/AITest.2019.00-13
Scopus ID: 2-s2.0-85067096441
ISBN: 9781728104928 (print)
OAI: oai:DiVA.org:ri-39269
DiVA, id: diva2:1334740
Conference
1st IEEE International Conference on Artificial Intelligence Testing, AITest 2019, 4 April 2019 through 9 April 2019
Note
Funding details: 20130085, 20160139; Funding details: VINNOVA, MegaM@RT2; Funding text 1: ECSEL & VINNOVA (through projects MegaM@RT2 & TESTOMAT) and the Swedish Knowledge Foundation (through the projects TOCSYC (20130085) and TestMine (20160139)) have supported this work.
2019-07-03, 2019-07-03, 2020-01-29. Bibliographically approved