Abstract: One of the main problems in the statistical analysis of Evolutionary Computation (EC) experiments is the ‘statistical personality’ of the data. A key feature of EC algorithms is the sampling of solutions from one generation to the next. Sampling is based on Holland’s schema theory, so solutions with the best fitness (or evaluation) values have a greater probability of being chosen. As a consequence, simulation experiments produce biased samples with non-normal, highly skewed, and asymmetric distributions. Furthermore, one of the main premises of the central limit theorem is not met, which invalidates statistical analysis based on the average fitness of the solutions. In this paper, we present a tutorial or ‘How-to’ explaining the basics of the statistical analysis of data in EC. The use of nonparametric tests for comparing two or more medians, combined with Exploratory Data Analysis, is a good option, bearing in mind that we consider only two experimental situations that are common among EC practitioners: (i) the performance evaluation of an algorithm and (ii) the comparison of multiple experiments. The different approaches are illustrated with examples (see http://bioinformatica.net/tests/survey.html) selected from Evolutionary Computation and the related field of Artificial Life.
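As an illustration of why the abstract favours medians and rank-based tests over average fitness, the following minimal Python sketch compares two sets of runs. It is not taken from the paper: the fitness values, the variable names `runs_a` and `runs_b`, and the helper functions are hypothetical, and the Mann-Whitney U statistic is used here only as one common example of a nonparametric two-sample test.

```python
from statistics import mean, median


def rank_with_ties(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples a and b."""
    ranks = rank_with_ties(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical best-fitness values from five runs of two algorithms.
# One poor run (0.40) skews the distribution of algorithm A.
runs_a = [0.91, 0.93, 0.94, 0.95, 0.40]
runs_b = [0.80, 0.82, 0.85, 0.86, 0.88]

print("mean A, mean B:", mean(runs_a), mean(runs_b))
print("median A, median B:", median(runs_a), median(runs_b))
print("U_a, U_b:", mann_whitney_u(runs_a, runs_b))
```

In this made-up data the single outlier pulls the mean of A below that of B, while the medians and the rank-based U statistic both indicate that A's runs are typically better. The sketch computes only the U statistic; in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used to obtain a p-value.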
Keywords: Evolutionary Computation, Statistical Analysis and Simulation.
ACM Classification Keywords: G.3 PROBABILITY AND STATISTICS
Link:
A SURVEY OF NONPARAMETRIC TESTS FOR THE STATISTICAL ANALYSIS OF EVOLUTIONARY COMPUTATIONAL EXPERIMENTS
Rafael Lahoz-Beltra, Carlos Perales-Gravan
http://foibg.com/ijita/vol17/ijita17-1-p07.pdf