TY  - JOUR
TI  - Wrapper for building classification models using Covering Arrays
T2  - IEEE Access
SP  - 148297
EP  - 148312
AU  - H. Dorado
AU  - C. Cobos
AU  - J. Torres-Jimenez
AU  - D. D. Burra
AU  - M. Mendoza
AU  - D. Jimenez
PY  - 2019
KW  - Feature extraction
KW  - Genetic algorithms
KW  - Arrays
KW  - Search problems
KW  - Buildings
KW  - Testing
KW  - Particle swarm optimization
KW  - Classification algorithms
KW  - Covering arrays
KW  - Random forest
KW  - Support vector machines
DO  - 10.1109/ACCESS.2019.2944641
JO  - IEEE Access
IS  - 1
SN  - 2169-3536
VL  - 7
JA  - IEEE Access
AB  - Wrapper methods are a type of feature selection method that finds a subset of variables to improve the performance of a classifier by removing redundant and irrelevant variables. The use of a wrapper implies that each time a candidate solution is explored, the classifier is evaluated on the selected quality measures (e.g., accuracy or precision). Though robust, this iteration across several candidate solutions can become computationally intensive and time-consuming. In this paper we propose a wrapper based on binary Covering Arrays (CAs) and binary Incremental Covering Arrays (ICAs), which have been widely used for experimental design and for fault detection in software and hardware testing. The new wrapper was evaluated with six classifiers on seven data sets. The results show that CAs and ICAs of strength 6 significantly improve performance and reduce the number of variables required by the classifier. A comparative analysis of the proposed method against wrappers based on other search approaches, such as genetic algorithms (GA) and particle swarm optimization (PSO), shows that the proposed method yields results similar to GA but falls short of PSO, with accuracy differences from PSO that are below 0.04 in the majority of cases. This accuracy gap relative to PSO is offset by the fact that the user does not need to fine-tune algorithm parameters, such as velocity ranges, timing, cognitive coefficient, and social coefficient, and the method is also much easier to program in parallel.
ER  - 