-*- mode: org -*-

* 3(<-596): MOP/GP models for machine learning
Techniques for machine learning have been extensively studied in recent years as effective tools in data mining. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) has recently gained much popularity. In pattern classification problems with two class sets, its idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. This task is performed by solving a quadratic programming problem in a traditional formulation, and can be reduced to solving a linear programming problem in another formulation. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper presents an overview of how effectively MOP/GP techniques can be applied to machine learning methods such as SVM, and discusses their problems. (c) 2004 Elsevier B.V. All rights reserved. 2005

* 4(<-614): Study on Support Vector Machines Using Mathematical Programming
Machine learning has been extensively studied in recent years as an effective tool in pattern classification problems. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) has recently gained much popularity.
In pattern classification problems with two class sets, the idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper proposes a new family of SVMs using MOP/GP techniques, and discusses its effectiveness through several numerical experiments. 2005

* 6(<-465): A Multiobjective Genetic SVM Approach for Classification Problems With Limited Training Samples
In this paper, a novel method for semisupervised classification with limited training samples is presented. Its aim is to exploit unlabeled data, available at zero cost in the image under analysis, for improving the accuracy of a classification process based on support vector machines (SVMs). It is based on the idea of augmenting the original set of training samples with a set of unlabeled samples after estimating their label. The label estimation process is performed within a multiobjective genetic optimization framework where each chromosome of the evolving population encodes the label estimates as well as the SVM classifier parameters for tackling the model selection issue. Such a process is guided by the joint minimization of two different criteria which express the generalization capability of the SVM classifier. The two explored criteria are an empirical risk measure and an indicator of the classification model sparseness, respectively. The experimental results obtained on two multisource remote sensing data sets confirm the promising capabilities of the proposed approach, which allows the following: 1) taking a clear advantage in terms of classification accuracy from unlabeled samples used for inflating the original training set and 2) solving automatically the tricky model selection issue.
2009

* 8(<-514): Genetic SVM approach to semisupervised multitemporal classification
The updating of classification maps, as new image acquisitions are obtained, raises the problem of ground-truth information (training samples) updating. In this context, semisupervised multitemporal classification represents an interesting though still not well consolidated approach to tackle this issue. In this letter, we propose a novel methodological solution based on this approach. Its underlying idea is to update the ground-truth information through an automatic estimation process, which exploits archived ground-truth information as well as basic indications from the user about allowed/forbidden class transitions from one acquisition date to another. This updating problem is formulated by means of the support vector machine classification approach and a constrained multiobjective optimization genetic algorithm. Experimental results on a multitemporal data set consisting of two multisensor (Landsat-5 Thematic Mapper and European Remote Sensing satellite synthetic aperture radar) images are reported and discussed. 2008

* 13(<- 90): A hybrid meta-learning architecture for multi-objective optimization of SVM parameters
Support Vector Machines (SVMs) have attracted considerable attention due to their theoretical foundations and good empirical performance when compared to other learning algorithms in different applications. However, SVM performance strongly depends on the adequate calibration of its parameters. In this work we propose a hybrid multi-objective architecture which combines meta-learning (ML) with multi-objective particle swarm optimization algorithms for the SVM parameter selection problem. Given an input problem, the proposed architecture uses an ML technique to suggest an initial Pareto front of SVM configurations based on previous similar learning problems; the suggested Pareto front is then refined by a multi-objective optimization algorithm.
In this combination, solutions provided by ML are likely to be located in good regions of the search space. Hence, using a reduced number of successful candidates, the search process converges faster and is less expensive. In the performed experiments, the proposed solution was compared to traditional multi-objective algorithms with random initialization, obtaining Pareto fronts with higher quality on a set of 100 classification problems. (C) 2014 Elsevier B.V. All rights reserved. 2014

* 19(<-585): Multiobjective analysis of chaotic dynamic systems with sparse learning machines
Sparse learning machines provide a viable framework for modeling chaotic time-series systems. A powerful state-space reconstruction methodology using both support vector machines (SVM) and relevance vector machines (RVM) within a multiobjective optimization framework is presented in this paper. The utility and practicality of the proposed approaches have been demonstrated on the time series of the Great Salt Lake (GSL) biweekly volumes from 1848 to 2004. A comparison of the two methods is made based on their predictive power and robustness. The reconstruction of the dynamics of the Great Salt Lake volume time series is attained using the most relevant feature subset of the training data. In this paper, efforts are also made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure, model parameters, and bootstrapping samples. The resulting model will normally have a structure, including parameterization, that suits the information content of the available data, and can be used to develop time series forecasts for multiple lead times ranging from two weeks to several months. (c) 2005 Elsevier Ltd. All rights reserved.
2006

* 20(<-157): Leave-one-out cross-validation-based model selection for multi-input multi-output support vector machine
As an effective approach for multi-input multi-output regression estimation problems, a multi-dimensional support vector regression (SVR), named M-SVR, is generally capable of obtaining better predictions than applying a conventional support vector machine (SVM) independently for each output dimension. However, although there are many generalization error bounds for conventional SVMs, none of them can be directly applied to M-SVR. In this paper, a new leave-one-out (LOO) error estimate for M-SVR is first derived through a virtual LOO cross-validation procedure. This LOO error estimate can be calculated directly once the training process ends, with less computational complexity than the traditional LOO method. Based on this LOO estimate, a new model selection method for M-SVR based on a multi-objective optimization strategy is then proposed. Experiments on a toy noisy function regression and a practical engineering data set (dynamic load identification on a cylinder vibration system) are both conducted, demonstrating the comparable results of the proposed method in terms of generalization performance and computational cost. 2014

* 23(<-609): Multi-objective model selection for support vector machines
In this article, model selection for support vector machines is viewed as a multi-objective optimization problem, where model complexity and training accuracy define two conflicting objectives. Different optimization criteria are evaluated: split modified radius margin bounds, which allow for comparing existing model selection criteria, and the training error in conjunction with the number of support vectors for designing sparse solutions. 2005

* 26(<-150): A novel feature selection method for twin support vector machine
Both the support vector machine (SVM) and the twin support vector machine (TWSVM) are powerful classification tools.
However, in contrast to the many SVM-based feature selection methods, TWSVM so far has no corresponding method, due to its different mechanism. In this paper, we propose a feature selection method based on TWSVM, called FTSVM. It is interesting because of the advantages of TWSVM in many cases. Our FTSVM is quite different from the SVM-based feature selection methods. In fact, a linear SVM constructs a single separating hyperplane which corresponds to a single weight for each feature, whereas a linear TWSVM constructs two fitting hyperplanes which correspond to two weights for each feature. In our linear FTSVM, a feature selection matrix is introduced in order to link these two fitting hyperplanes. Thus, feature selection becomes the problem of finding an optimal matrix, which leads to solving a multi-objective mixed-integer programming problem by a greedy algorithm. In addition, the linear FTSVM has been extended to the nonlinear case. Furthermore, a feature ranking strategy based on FTSVM is also suggested. The experimental results on several publicly available benchmark datasets indicate that our FTSVM not only gives good feature selection in both linear and nonlinear cases but also improves the performance of TWSVM efficiently. (C) 2014 Elsevier B.V. All rights reserved. 2014

* 30(<- 8): Novel approaches using evolutionary computation for sparse least square support vector machines
This paper introduces two new approaches to building sparse least square support vector machines (LSSVM) based on genetic algorithms (GAs) for classification tasks. LSSVM classifiers are an alternative to SVM ones because the training process of LSSVM classifiers only requires solving a linear equation system instead of a quadratic programming optimization problem. However, the absence of sparseness in the Lagrange multiplier vector (i.e. the solution) is a significant problem for the effective use of these classifiers.
In order to overcome this lack of sparseness, we propose both single- and multi-objective GA approaches to leave a few support vectors out of the solution without affecting the classifier's accuracy, and even improving it. The main idea is to leave out outliers, non-relevant patterns, or patterns which may be corrupted by noise and would thus prevent classifiers from achieving higher accuracies along with a reduced set of support vectors. Differently from previous works, genetic algorithms are used in this work to obtain sparseness, not to find the optimal values of the LSSVM hyper-parameters. (C) 2015 Elsevier B.V. All rights reserved. 2015

* 31(<-586): Additive preference model with piecewise linear components resulting from Dominance-based Rough Set Approximations
Dominance-based Rough Set Approach (DRSA) has been proposed for multi-criteria classification problems in order to handle inconsistencies in the input information with respect to the dominance principle. The end result of DRSA is a decision rule model of Decision Maker preferences. In this paper, we consider an additive function model resulting from dominance-based rough approximations. The presented approach is similar to the UTA and UTADIS methods. However, we define the goal function of the optimization problem in a similar way as is done in Support Vector Machines (SVM). The problem may also be defined as one of searching for linear value functions in a transformed feature space obtained by exhaustive binarization of criteria. 2006

* 32(<-120): A niching genetic programming-based multi-objective algorithm for hybrid data classification
This paper introduces a multi-objective algorithm based on genetic programming to extract classification rules in databases composed of hybrid data, i.e., regular (e.g. numerical, logical, and textual) and non-regular (e.g. geographical) attributes.
This algorithm employs a niching technique combined with a population archive in order to identify the rules that are more suitable for classifying items amongst classes of a given data set. The algorithm is implemented in such a way that the user can choose the function set that is more adequate for a given application. This feature makes the proposed approach virtually applicable to any kind of data set classification problem. Besides, the classification problem is modeled as a multi-objective one, in which the maximization of the accuracy and the minimization of the classifier complexity are considered as the objective functions. A set of different classification problems, with considerably different data sets and domains, has been considered: wines, patients with hepatitis, incipient faults in power transformers and level of development of cities. In this last data set, some of the attributes are geographical, and they are expressed as points, lines or polygons. The effectiveness of the algorithm has been compared with three other methods widely employed for classification: Decision Tree (C4.5), Support Vector Machine (SVM) and Radial Basis Function (RBF). Statistical comparisons have been conducted employing one-way ANOVA and Tukey's tests, in order to provide reliable comparison of the methods. The results show that the proposed algorithm achieved better classification effectiveness in all tested instances, which suggests that it is suitable for a considerable range of classification applications. (C) 2014 Elsevier B.V. All rights reserved. 2014

* 37(<- 64): Surrogate-assisted multi-objective model selection for support vector machines
Classification is one of the most well-known tasks in supervised learning. A vast number of algorithms for pattern classification have been proposed so far.
Among these, support vector machines (SVMs) are one of the most popular approaches, due to the high performance reached by these methods in a wide number of pattern recognition applications. Nevertheless, the effectiveness of SVMs highly depends on their hyper-parameters. Besides the fine-tuning of the hyper-parameters, the way in which the features are scaled, as well as the presence of non-relevant features, could affect their generalization performance. This paper introduces an approach for addressing model selection for support vector machines used in classification tasks. In our formulation, a model can be composed of feature selection and pre-processing methods besides the SVM classifier. We formulate the model selection problem as a multi-objective one, aiming to simultaneously minimize two components that are closely related to the error of a model: the bias and variance components, which are estimated in an experimental fashion. A surrogate-assisted evolutionary multi-objective optimization approach is adopted to explore the hyper-parameter space. We adopted this approach because estimating the bias and variance can be computationally expensive. Therefore, by using surrogate-assisted optimization, we expect to reduce the number of solutions evaluated by the fitness functions, so that the computational cost is also reduced. Experimental results conducted on benchmark datasets widely used in the literature indicate that our proposal obtains highly competitive models with fewer fitness function evaluations, when compared to state-of-the-art model selection methods. (C) 2014 Elsevier B.V. All rights reserved.
2015

* 40(<-467): AG-ART: An adaptive approach to evolving ART architectures
This paper focuses on classification problems, and in particular on the evolution of ARTMAP architectures using genetic algorithms, with the objective of improving generalization performance and alleviating the adaptive resonance theory (ART) category proliferation problem. In a previous effort, we introduced evolutionary fuzzy ARTMAP (FAM), referred to as genetic Fuzzy ARTMAP (GFAM). In this paper we apply an improved genetic algorithm to FAM and extend these ideas to two other ART architectures: ellipsoidal ARTMAP (EAM) and Gaussian ARTMAP (GAM). One of the major advantages of the proposed improved genetic algorithm is that it adapts the GA parameters automatically, in a way that takes into consideration the intricacies of the classification problem under consideration. The resulting genetically engineered ART architectures are justifiably referred to as AG-FAM, AG-EAM and AG-GAM, or collectively as AG-ART (adaptive genetically engineered ART). We compare the performance (in terms of accuracy, size, and computational cost) of the AG-ART architectures with GFAM and other ART architectures that have appeared in the literature and attempted to solve the category proliferation problem. Our results demonstrate that AG-ART architectures exhibit better performance than their other ART counterparts (semi-supervised ART) and better performance than GFAM. We also compare AG-ART's performance to other related results published in the classification literature, and demonstrate that AG-ART architectures exhibit competitive generalization performance and, quite often, produce smaller-size classifiers in solving the same classification problems. We also show that AG-ART's performance gains are achieved within a reasonable computational budget. (C) 2008 Elsevier B.V. All rights reserved.
2009

* 41(<-580): Multi-objective parameters selection for SVM classification using NSGA-II
Selecting proper parameters is an important issue in extending the classification ability of the Support Vector Machine (SVM), which makes SVM practically useful. The Genetic Algorithm (GA) has been widely applied to the problem of parameter selection for SVM classification due to its ability to discover good solutions quickly for complex searching and optimization problems. However, traditional GAs in this field rely on a single generalization error bound as the fitness function for selecting parameters. Since several generalization error bounds have been developed, picking and using a single criterion as the fitness function seems insufficient. Motivated by multi-objective optimization, this paper introduces an efficient method of parameter selection for SVM classification based on the multi-objective evolutionary algorithm NSGA-II. We also introduce an adaptive mutation rate for NSGA-II. Experimental results show that our method is better than single-objective approaches, especially in the case of tiny training sets with large testing sets. 2006

* 42(<-589): Multiobjective optimization of ensembles of multilayer perceptrons for pattern classification
Pattern classification seeks to minimize the error on unknown patterns. However, in many real-world applications, type I (false positive) and type II (false negative) errors have to be dealt with separately, which is a complex problem since an attempt to minimize one of them usually makes the other grow. In fact, one type of error can be more important than the other, and a trade-off that minimizes the most important error type must be reached. Despite the importance of type II errors, most pattern classification methods take into account only the global classification error.
In this paper we propose to optimize both error types in classification by means of a multiobjective algorithm in which each error type and the network size is an objective of the fitness function. A modified version of the GProp method (optimization and design of multilayer perceptrons) is used to simultaneously optimize the network size and the type I and type II errors. 2006

* 43(<-238): Multiplicative Update Rules for Concurrent Nonnegative Matrix Factorization and Maximum Margin Classification
The state-of-the-art classification methods which use nonnegative matrix factorization (NMF) employ two consecutive independent steps. The first one performs data transformation (dimensionality reduction) and the second one classifies the transformed data using classification methods such as nearest neighbor/centroid or support vector machines (SVMs). In the following, we focus on using NMF factorization followed by SVM classification. Typically, the parameters of these two steps, e.g., the NMF bases/coefficients and the support vectors, are optimized independently, thus leading to suboptimal classification performance. In this paper, we merge these two steps into one by incorporating maximum margin classification constraints into the standard NMF optimization. The notion behind the proposed framework is to perform NMF while ensuring that the margin between the projected data of the two classes is maximal. The concurrent NMF factorization and support vector optimization are performed through a set of multiplicative update rules. In the same context, the maximum margin classification constraints are imposed on the NMF problem with additional discriminant constraints, and the respective multiplicative update rules are extracted. The impact of the maximum margin classification constraints on the NMF factorization problem is addressed in Section VI.
Experimental results in several databases indicate that the incorporation of the maximum margin classification constraints into the NMF and discriminant NMF objective functions improves the accuracy of the classification. 2013

* 44(<-425): A multi-model selection framework for unknown and/or evolutive misclassification cost problems
In this paper, we tackle the problem of model selection when misclassification costs are unknown and/or may evolve. Unlike traditional approaches based on a scalar optimization, we propose a generic multi-model selection framework based on a multi-objective approach. The idea is to automatically train a pool of classifiers instead of one single classifier, each classifier in the pool optimizing a particular trade-off between the objectives. Within the context of two-class classification problems, we introduce the "ROC front concept" as an alternative to the ROC curve representation. This strategy is applied to the multi-model selection of SVM classifiers using an evolutionary multi-objective optimization algorithm. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets as well as on a real-world classification problem. (C) 2009 Elsevier Ltd. All rights reserved. 2010

* 45(<-429): NONCOST SENSITIVE SVM TRAINING USING MULTIPLE MODEL SELECTION
In this paper, we propose a multi-objective optimization framework for SVM hyperparameter tuning. The key idea is to manage a population of classifiers optimizing both False Positive and True Positive rates, rather than a single classifier optimizing a scalar criterion. Hence, each classifier in the population optimizes a particular trade-off between the objectives. Within the context of two-class classification problems, our work introduces the "receiver operating characteristics (ROC) front concept", depicting a population of SVM classifiers, as an alternative to the ROC curve representation.
The proposed framework leads to noncost-sensitive SVM training relying on the pool of classifiers. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets. 2010

* 46(<-567): Two-group classification via a biobjective margin maximization model
In this paper we propose a biobjective model for two-group classification via margin maximization, in which the margins in both classes are simultaneously maximized. The set of Pareto-optimal solutions is described, yielding a set of parallel hyperplanes, one of which is just the solution of the classical SVM approach. In order to take into account different misclassification costs or a priori probabilities, the ROC curve can be used to select one of these hyperplanes by expressing the adequate tradeoff between sensitivity and specificity. Our result gives a theoretical motivation for using the ROC approach in case the misclassification costs in the two groups are not necessarily equal. (c) 2005 Elsevier B.V. All rights reserved. 2006

* 47(<-573): Multi-class ROC analysis from a multi-objective optimisation perspective
The receiver operating characteristic (ROC) has become a standard tool for the analysis and comparison of classifiers when the costs of misclassification are unknown. There has been relatively little work, however, examining ROC for more than two classes. Here we discuss and present an extension of the standard two-class ROC to multi-class problems. We define the ROC surface for the Q-class problem in terms of a multi-objective optimisation problem in which the goal is to simultaneously minimise the Q(Q-1) misclassification rates, when the misclassification costs and the parameters governing the classifier's behaviour are unknown. We present an evolutionary algorithm to locate the Pareto front, the optimal trade-off surface between misclassifications of different types.
The use of the Pareto-optimal surface to compare classifiers is discussed, and we present a straightforward multi-class analogue of the Gini coefficient. The performance of the evolutionary algorithm is illustrated on a synthetic three-class problem, for both k-nearest neighbour and multi-layer perceptron classifiers. (c) 2005 Elsevier B.V. All rights reserved. 2006

* 48(<- 26): Joint model for feature selection and parameter optimization coupled with classifier ensemble in chemical mention recognition
Mention recognition in chemical texts plays an important role in a wide range of application areas. Feature selection and parameter optimization are two important issues in machine learning. While the former improves the quality of a classifier by removing redundant and irrelevant features, the latter concerns finding the most suitable parameter values, which have a significant impact on the overall classification performance. In this paper we formulate a joint model that performs feature selection and parameter optimization simultaneously, and propose two approaches based on the concepts of single- and multiobjective optimization techniques. Classifier ensemble techniques are also employed to improve the performance further. We identify and implement a variety of features that are mostly domain-independent. Experiments are performed with various configurations on the benchmark patent and Medline datasets. Evaluation shows encouraging performance in all the settings. (C) 2015 Elsevier B.V. All rights reserved. 2015

* 53(<- 44): The influence of scaling metabolomics data on model classification accuracy
Correctly measured classification accuracy is important not only for properly classifying pre-designated classes such as disease versus control, but also for ensuring that the biological question can be answered competently.
We recognised that there has been minimal investigation of pre-treatment methods and their influence on classification accuracy within the metabolomics literature. The standard approach to pre-treatment prior to classification modelling often incorporates the use of methods such as autoscaling, which positions all variables on a comparable scale, thus allowing one to achieve separation of two or more groups (target classes). This is often undertaken without any prior investigation into the influence of the pre-treatment method on the data and the supervised learning techniques employed. Whilst this is useful for deriving essential information such as predictive ability or visual interpretation in many cases, as shown in this study the standard approach is not always the most suitable option available. Here, a study has been conducted to investigate the influence of six pre-treatment methods (autoscaling, range, level, Pareto and vast scaling, as well as no scaling) on four classification models: principal components-discriminant function analysis (PC-DFA), support vector machines (SVM), random forests (RF) and k-nearest neighbours (kNN), using three publicly available metabolomics data sets. We have demonstrated that undertaking different pre-treatment methods can greatly affect the interpretation of the statistical modelling outputs. The results have shown that data pre-treatment is context dependent and that there was no single superior method for all the data sets used. Whilst we did find that vast scaling produced the most robust models in terms of classification rate for PC-DFA of both NMR spectroscopy data sets, in general we conclude that both vast scaling and autoscaling produced similar and superior results in comparison to the other four pre-treatment methods on both the NMR and GC-MS data sets.
It is therefore our recommendation that vast scaling be the primary pre-treatment method, as it appears to be more stable and robust across all the different classifiers applied in this study. 2015

* 54(<-127): SVM classification for imbalanced data sets using a multiobjective optimization framework
Classification of imbalanced data sets, in which negative instances outnumber the positive instances, is a significant challenge. These data sets are commonly encountered in real-life problems. However, the performance of well-known classifiers is limited in such cases. Support Vector Machines (SVMs), which have a solid theoretical background, also encounter a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L-1-norm SVM approach that is based on a three-objective optimization problem, so as to incorporate into the formulation the error sums for the two classes independently. Motivated by the inherent multi-objective nature of SVMs, the solution approach utilizes a reduction into two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements with varying degrees of increased computational effort. 2014

* 66(<-498): Accurate and resource-aware classification based on measurement data
In this paper, we face the problem of designing accurate decision-making modules in measurement systems that need to be implemented on resource-constrained platforms. We propose a methodology based on multiobjective optimization and genetic algorithms (GAs) for the analysis of support vector machine (SVM) solutions in the classification error-complexity space.
Specific criteria for the choice of optimal SVM classifiers and experimental results on both real and synthetic data are also discussed. 2008

* 69(<-369): Classification as Clustering: A Pareto Cooperative-Competitive GP Approach
Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.
2011 * 71(<-411): Integrating Clustering and Supervised Learning for Categorical Data Analysis The problem of fuzzy clustering of categorical data, where no natural ordering among the elements of a categorical attribute domain can be found, is an important problem in exploratory data analysis. As a result, a few clustering algorithms with a focus on categorical data have been proposed. In this paper, a modified differential evolution (DE)-based fuzzy c-medoids (FCMdd) clustering of categorical data has been proposed. The algorithm combines both local as well as global information with adaptive weighting. The performance of the proposed method has been compared with those using a genetic algorithm, simulated annealing, and the classical DE technique, besides the FCMdd, fuzzy k-modes, and average linkage hierarchical clustering algorithms, for four artificial and four real-life categorical data sets. Statistical tests have been carried out to establish the statistical significance of the proposed method. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the data points selected from different clusters based on their proximity to the respective medoids is used for training the SVM. The clustering assignments of the remaining points are thereafter determined using the trained classifier. The superiority of the integrated clustering and supervised learning approach has been demonstrated. 2010 * 77(<- 79): Pareto-Path Multitask Multiple Kernel Learning A traditional and intuitively appealing Multitask Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with (partially) shared kernel function, which allows information sharing among the tasks.
We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a multiobjective optimization problem, which considers the concurrent optimization of all task objectives involved in the Multitask Learning (MTL) problem. Motivated by this last observation and arguing that the former approach is heuristic, we propose a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives. We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance when compared with other similar MTL approaches. 2015 * 82(<-330): Integrating multicriteria PROMETHEE II method into a single-layer perceptron for two-class pattern classification PROMETHEE methods based on the outranking relation theory are extensively used in multicriteria decision aid. A preference index representing the intensity of preference for one pattern over another pattern can be measured by various preference functions. The higher the intensity, the stronger the preference is indicated. In contrast to traditional single-layer perceptrons (SLPs) with the sigmoid function, this paper develops a novel PROMETHEE II-based SLP using concepts from the PROMETHEE II method involving pairwise comparisons between patterns. The assignment of a class label to a pattern is dependent on its net preference index, which the proposed perceptron obtains. Specifically, this study designs a genetic-algorithm-based learning algorithm to determine the relative weights of respective criteria in order to derive the preference index for any pair of patterns.
Computer simulations involving several real-world data sets reveal the classification performance of the proposed PROMETHEE II-based SLP. The proposed perceptron performs well compared to the other well-known fuzzy or non-fuzzy classification methods. 2011 * 84(<-399): A single-layer perceptron with PROMETHEE methods using novel preference indices The Preference Ranking Organization METHods for Enrichment Evaluations (PROMETHEE) methods, based on the outranking relation theory, are used extensively in multi-criteria decision aid (MCDA). In particular, preference indices with weighted average aggregation representing the intensity of preference for one pattern over another pattern are measured by various preference functions. The higher the intensity, the stronger the preference is indicated. For MCDA, to obtain the ranking of alternatives, compromise operators such as the weighted average aggregation, or the disjunctive operators are often employed to aggregate the performance values of criteria. The compromise operators express the group utility or the majority rule, whereas the disjunctive operators take into account the strongly opponent or agreeable minorities. Since these two types of operators have their own unique features, it is interesting to develop a novel aggregator by integrating them into a single aggregator for a preference index. This study aims to develop a novel PROMETHEE-based single-layer perceptron (PROSLP) for pattern classification using the proposed preference index. The assignment of a class label to a pattern is dependent on its net preference index, which is obtained by the proposed perceptron. Computer simulations involving several real-world data sets reveal the classification performance of the proposed PROMETHEE-based SLP. The proposed perceptron with the novel preference index performs well compared to that with the original one. (C) 2010 Elsevier B.V. All rights reserved. 
2010 * 117(<- 73): A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications Particle swarm optimization (PSO) is a heuristic global optimization method, proposed originally by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presented a comprehensive investigation of PSO. On the one hand, we reviewed advances in PSO, including its modifications (including quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topology (as fully connected, von Neumann, ring, star, random, etc.), hybridization (with genetic algorithm, simulated annealing, Tabu search, artificial immune system, ant colony algorithm, artificial bee colony, differential evolution, harmonic search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementation (in multicore, multiprocessor, GPU, and cloud computing forms). On the other hand, we offered a survey on applications of PSO to the following eight fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, and chemistry and biology. It is hoped that this survey would be beneficial for the researchers studying PSO algorithms. 2015 * 133(<-135): Nonadditive similarity-based single-layer perceptron for multi-criteria collaborative filtering The main aim of the popular collaborative filtering approaches for recommender systems is to recommend items that users with similar preferences have liked in the past. Although single-criterion recommender systems have been successfully used in several applications, multi-criteria rating systems that allow users to specify ratings for various content attributes for individual items are gaining in importance.
To measure the overall similarity between any two users for multi-criteria collaborative filtering, the indifference relation in outranking relation theory, which can justify discrimination between any two patterns, is suitable for multi-criteria decision making (MCDM). However, nonadditive indifference indices that address interactions among criteria should be taken into account. This paper proposes a novel similarity-based perceptron using nonadditive indifference indices to estimate an overall rating that a user would give to a specific item. The applicability of the proposed model to recommendation of initiators on a group-buying website was examined. Experimental results demonstrate that the proposed model performs well in terms of generalization ability compared to other multi-criteria collaborative filtering approaches. (C) 2013 Elsevier B.V. All rights reserved. 2014 * 137(<-275): A two-stage evolutionary algorithm based on sensitivity and accuracy for multi-class problems The machine learning community has traditionally used correct classification rates or accuracy (C) values to measure classifier performance and has generally avoided presenting classification levels of each class in the results, especially for problems with more than two classes. C values alone are insufficient because they cannot capture the myriad of contributing factors that differentiate the performance of two different classifiers. Receiver Operating Characteristic (ROC) analysis is an alternative to solve these difficulties, but it can only be used for two-class problems. For this reason, this paper proposes a new approach for analysing classifiers based on two measures: C and sensitivity (S) (i.e., the minimum of accuracies obtained for each class). These measures are optimised through a two-stage evolutionary process. 
It was conducted by applying two sequential fitness functions in the evolutionary process, including entropy (E) for the first stage and a new fitness function, area (A), for the second stage. By using these fitness functions, the C level was optimised in the first stage, and the S value of the classifier was generally improved without significantly reducing C in the second stage. This two-stage approach improved S values in the generalisation set (whereas an evolutionary algorithm (EA) based only on the S measure obtains worse S levels) and obtained both high C values and good classification levels for each class. The methodology was applied to solve 16 benchmark classification problems and two complex real-world problems in analytical chemistry and predictive microbiology. It obtained promising results when compared to other competitive multiclass classification algorithms and a multi-objective alternative based on E and S. (C) 2012 Elsevier Inc. All rights reserved. 2012 * 141(<-398): Brain-Computer Evolutionary Multiobjective Optimization: A Genetic Algorithm Adapting to the Decision Maker The centrality of the decision maker (DM) is widely recognized in the multiple criteria decision-making community. This translates into emphasis on seamless human-computer interaction, and adaptation of the solution technique to the knowledge which is progressively acquired from the DM. This paper adopts the methodology of reactive search optimization (RSO) for evolutionary interactive multiobjective optimization. RSO follows to the paradigm of "learning while optimizing," through the use of online machine learning techniques as an integral part of a self-tuning optimization scheme. User judgments of couples of solutions are used to build robust incremental models of the user utility function, with the objective to reduce the cognitive burden required from the DM to identify a satisficing solution. 
The technique of support vector ranking is used together with a k-fold cross-validation procedure to select the best kernel for the problem at hand, during the utility function training procedure. Experimental results are presented for a series of benchmark problems. 2010 * 144(<- 7): Multiple criteria decision aiding for finance: An updated bibliographic survey Finance is a popular field for applied and methodological research involving multiple criteria decision aiding (MCDA) techniques. In this study we present an up-to-date bibliographic survey of the contributions of MCDA in financial decision making, focusing on the developments during the past decade. The survey covers all main areas of financial modeling as well as the different methodological approaches in MCDA and its connections with other analytical fields. On the basis of the survey results, we discuss the contributions of MCDA in different areas of financial decision making and identify established and emerging research topics, as well as future opportunities and challenges. (C) 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the International Federation of Operational Research Societies (IFORS). All rights reserved. 2015 * 146(<-365): Preference disaggregation and statistical learning for multicriteria decision support: A review Disaggregation methods have become popular in multicriteria decision aiding (MCDA) for eliciting preferential information and constructing decision models from decision examples. From a statistical point of view, data mining and machine learning are also involved with similar problems, mainly with regard to identifying patterns and extracting knowledge from data. Recent research has also focused on the introduction of specific domain knowledge in machine learning algorithms. Thus, the connections between disaggregation methods in MCDA and traditional machine learning tools are becoming stronger. 
In this paper the relationships between the two fields are explored. The differences and similarities between the two approaches are identified, and a review is given regarding the integration of the two fields. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 149(<-478): A memetic model of evolutionary PSO for computational finance applications Motivated by the compensatory property of EA and PSO, where the latter can enhance solutions generated from the evolutionary operations by exploiting their individual memory and social knowledge of the swarm, this paper examines the implementation of PSO as a local optimizer for fine tuning in evolutionary search. The proposed approach is evaluated on applications from the field of computational finance, namely portfolio optimization and time series forecasting. Exploiting the structural similarity between these two problems and the non-linear fractional knapsack problem, an instance of the latter is generalized and implemented as the preliminary test platform for the proposed EA-PSO hybrid model. The experimental results demonstrate the positive effects of this memetic synergy and reveal general design guidelines for the implementation of PSO as a local optimizer. Algorithmic performance improvements are similarly evident when extending to the real-world optimization problems under the appropriate integration of PSO with EA. (C) 2008 Elsevier Ltd. All rights reserved. 2009 * 155(<-680): PERCEPTRONS PLAY THE REPEATED PRISONERS-DILEMMA We examine the implications of bounded rationality in repeated games by modeling the repeated game strategies as perceptrons (F. Rosenblatt, ''Principles of Neurodynamics,'' Spartan Books, and M. Minsky and S. A. Papert, ''Perceptrons: An Introduction to Computational Geometry,'' MIT Press, Cambridge, MA, 1988).
In the prisoner's dilemma game, if the cooperation outcome is Pareto efficient, then we can establish the folk theorem by perceptrons with single associative units (Minsky and Papert), whose computational capability barely exceeds what we would expect from players capable of fictitious plays (e.g., L. Shapley, Some topics in two-person games, Adv. Game Theory 5 (1964), 1-28). (C) 1995 Academic Press, Inc. 1995 * 156(<-206): Genetic Algorithms, a Nature-Inspired Tool: A Survey of Applications in Materials Science and Related Fields: Part II Genetic algorithms (GAs) are a helpful tool in optimization, simulation, modelling, design, and prediction purposes in various domains of science including materials science, medicine, technology, economy, industry, environment protection, etc. Reported uses of GAs led to solving of numerous complex computational tasks. In materials science and related fields of science and technology, GAs are routinely used for materials modeling and design, for optimization of material properties, the method is also useful in organizing the material or device production at the industrial scale. Here, the most recent (years 2008-2012) applications of GAs in materials science and in related fields (solid state physics and chemistry, crystallography, production, and engineering) are reviewed. The representative examples selected from recent literature show how broad is the usefulness of this computational method. 2013 * 161(<-625): Developing sorting models using preference disaggregation analysis: An experimental investigation Within the field of multicriteria decision aid, sorting refers to the assignment of a set of alternatives into predefined homogenous groups defined in an ordinal way. The real-world applications of this type of problem extend to a wide range of decision-making fields. 
Preference disaggregation analysis provides the framework for developing sorting models through the analysis of the global judgment of the decision-maker using mathematical programming techniques. However, the automatic elicitation of preferential information through the preference disaggregation analysis raises several issues regarding the impact of the parameters involved in the model development process on the performance and the stability of the developed models. The objective of this paper is to shed light on this issue. For this purpose the UTADIS preference disaggregation sorting method (UTilites Additives DIScriminantes) is considered. The conducted analysis is based on an extensive Monte Carlo simulation and useful findings are obtained on the aforementioned issues. (C) 2003 Elsevier B.V. All rights reserved. 2004 * 163(<- 88): Pareto Front Estimation for Decision Making The set of available multi-objective optimisation algorithms continues to grow. This fact can be partially attributed to their widespread use and applicability. However, this increase also suggests several issues remain to be addressed satisfactorily. One such issue is the diversity and the number of solutions available to the decision maker (DM). Even for algorithms very well suited for a particular problem, it is difficult-mainly due to the computational cost-to use a population large enough to ensure the likelihood of obtaining a solution close to the DM's preferences. In this paper we present a novel methodology that produces additional Pareto optimal solutions from a Pareto optimal set obtained at the end run of any multi-objective optimisation algorithm for two-objective and three-objective problem instances. 
2014 * 164(<-306): Memetic algorithms and memetic computing optimization: A literature review Memetic computing is a subject in computer science which considers complex structures such as the combination of simple agents and memes, whose evolutionary interactions lead to intelligent complexes capable of problem-solving. The founding cornerstone of this subject has been the concept of memetic algorithms, that is a class of optimization algorithms whose structure is characterized by an evolutionary framework and a list of local search components. This article presents a broad literature review on this subject focused on optimization problems. Several classes of optimization problems, such as discrete, continuous, constrained, multi-objective and characterized by uncertainties, are addressed by indicating the memetic "recipes" proposed in the literature. In addition, this article focuses on implementation aspects and especially the coordination of memes which is the most important and characterizing aspect of a memetic structure. Finally, some considerations about future trends in the subject are given. (C) 2011 Elsevier B.V. All rights reserved. 2012 * 165(<-511): Pareto-based multiobjective machine learning: An overview and case studies Machine learning is inherently a multiobjective task. Traditionally, however, either only one of the objectives is adopted as the cost function or multiple objectives are aggregated to a scalar cost function. This can be mainly attributed to the fact that most conventional learning algorithms can only deal with a scalar cost function. Over the last decade, efforts on solving machine learning problems using the Pareto-based multiobjective optimization methodology have gained increasing impetus, particularly due to the great success of multiobjective optimization using evolutionary algorithms and other population-based stochastic search methods. 
It has been shown that Pareto-based multiobjective learning approaches are more powerful compared to learning algorithms with a scalar cost function in addressing various topics of machine learning, such as clustering, feature selection, improvement of generalization ability, knowledge extraction, and ensemble generation. One common benefit of the different multiobjective learning approaches is that a deeper insight into the learning problem can be gained by analyzing the Pareto front composed of multiple Pareto-optimal solutions. This paper provides an overview of the existing research on multiobjective machine learning, focusing on supervised learning. In addition, a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning, e.g., how to identify interpretable models and models that can generalize on unseen data from the obtained Pareto-optimal solutions. Three approaches to Pareto-based multiobjective ensemble generation are compared and discussed in detail. Finally, potentially interesting topics in multiobjective machine learning are suggested. 2008 * 167(<-126): Parameter identification and calibration of the Xin'anjiang model using the surrogate modeling approach Practical experience has demonstrated that single objective functions, no matter how carefully chosen, prove to be inadequate in providing proper measurements for all of the characteristics of the observed data. One strategy to circumvent this problem is to define multiple fitting criteria that measure different aspects of system behavior, and to use multi-criteria optimization to identify non-dominated optimal solutions. Unfortunately, these analyses require running original simulation models thousands of times. As such, they demand prohibitively large computational budgets. 
As a result, surrogate models have been used in combination with a variety of multi-objective optimization algorithms to approximate the true Pareto-front within limited evaluations for the original model. In this study, multi-objective optimization based on surrogate modeling (multivariate adaptive regression splines, MARS) for a conceptual rainfall-runoff model (Xin'anjiang model, XAJ) was proposed. Taking the Yanduhe basin of Three Gorges in the upper stream of the Yangtze River in China as a case study, three evaluation criteria were selected to quantify the goodness-of-fit of observations against calculated values from the simulation model. The three criteria chosen were the Nash-Sutcliffe efficiency coefficient and the relative errors of peak flow (REPF) and runoff volume (RERV). The efficacy of this method is demonstrated on the calibration of the XAJ model. Compared to the single-objective optimization results, it was indicated that the multi-objective optimization method can infer the most probable parameter set. The results also demonstrate that the use of surrogate modeling enables optimization that is much more efficient; and the total computational cost is reduced by about 92.5%, compared to optimization without using surrogate modeling. The results obtained with the proposed method support the feasibility of applying parameter optimization to computationally intensive simulation models, via reducing the number of simulation runs required in the numerical model considerably. 2014 * 169(<-268): Multiresponse Metamodeling in Simulation-Based Design Applications The optimal design of complex systems in engineering requires the availability of mathematical models of the system's behavior as a function of a set of design variables; such models allow the designer to search for the best solution to the design problem.
However, system models (e.g., computational fluid dynamics (CFD) analysis, physical prototypes) are usually time-consuming and expensive to evaluate, and thus unsuited for systematic use during design. Approximate models of system behavior based on limited data, also known as metamodels, allow significant savings by reducing the resources devoted to modeling during the design process. In this work on engineering design based on multiple performance criteria, we propose the use of multi-response Bayesian surrogate models (MR-BSM) to model several aspects of system behavior jointly, instead of modeling each individually. To this end, we formulated a family of multiresponse correlation functions, suitable for prediction of several response variables that are observed simultaneously from the same computer simulation. Using a set of test functions with varying degrees of correlation, we compared the performance of MR-BSM against metamodels built individually for each response. Our results indicate that MR-BSM outperforms individual metamodels in 53% to 75% of the test cases, though the relative performance depends on the sample size, sampling scheme and the actual correlation among the observed response values. In addition, the relative performance of MR-BSM versus individual metamodels was contingent upon the ability to select an appropriate covariance/correlation function for each application, a task for which a modified version of Akaike's Information Criterion was observed to be inadequate. [DOI: 10.1115/1.4006996] 2012 * 170(<-428): Multiobjective global surrogate modeling, dealing with the 5-percent problem When dealing with computationally expensive simulation codes or process measurement data, surrogate modeling methods are firmly established as facilitators for design space exploration, sensitivity analysis, visualization, prototyping and optimization.
Typically the model parameter (=hyperparameter) optimization problem as part of global surrogate modeling is formulated in a single objective way. Models are generated according to a single objective (accuracy). However, this requires an engineer to determine a single accuracy target and measure upfront, which is hard to do if the behavior of the response is unknown. Likewise, the different outputs of a multi-output system are typically modeled separately by independent models. Again, a multiobjective approach would benefit the domain expert by giving information about output correlation and enabling automatic model type selection for each output dynamically. With this paper the authors attempt to increase awareness of the subtleties involved and discuss a number of solutions and applications. In particular, we present a multiobjective framework for global surrogate model generation to help tackle both problems and that is applicable in both the static and sequential design (adaptive sampling) case. 2010 * 211(<-324): Mobility Timing for Agent Communities, a Cue for Advanced Connectionist Systems We introduce a wait-and-chase scheme that models the contact times between moving agents within a connectionist construct. The idea that elementary processors move within a network to get a proper position is borne out both by biological neurons in the brain morphogenesis and by agents within social networks. From the former, we take inspiration to devise a medium-term project for new artificial neural network training procedures where mobile neurons exchange data only when they are close to one another in a proper space (are in contact). From the latter, we accumulate mobility tracks experience. We focus on the preliminary step of characterizing the elapsed time between neuron contacts, which results from a spatial process fitting in the family of random processes with memory, where chasing neurons are stochastically driven by the goal of hitting target neurons. 
Thus, we add an unprecedented mobility model to the literature in the field, introducing a distribution law of the intercontact times that merges features of both negative exponential and Pareto distribution laws. We give a constructive description and implementation of our model, as well as a short analytical form whose parameters are suitably estimated in terms of confidence intervals from experimental data. Numerical experiments show the model and related inference tools to be sufficiently robust to cope with two main requisites for its exploitation in a neural network: the nonindependence of the observed intercontact times and the feasibility of the model inversion problem to infer suitable mobility parameters. 2011 * 218(<-472): Stochastic sampling design using a multi-objective genetic algorithm and adaptive neural networks This paper presents a novel multi-objective genetic algorithm (MOGA) based on the NSGA-II algorithm, which uses metamodels to determine optimal sampling locations for installing pressure loggers in a water distribution system (WDS) when parameter uncertainty is considered. The new algorithm combines the multi-objective genetic algorithm with adaptive neural networks (MOGA-ANN) to locate pressure loggers. The purpose of pressure logger installation is to collect data for hydraulic model calibration. Sampling design is formulated as a two-objective optimization problem in this study. The objectives are to maximize the calibrated model accuracy and to minimize the number of sampling devices as a surrogate of sampling design cost. Calibrated model accuracy is defined as the average of normalized traces of model prediction covariance matrices, each of which is constructed from a randomly generated sampling set of calibration parameter values. This method of calculating model accuracy is called the 'full' fitness model. 
Within the genetic algorithm search process, the full fitness model is progressively replaced with the periodically (re)trained adaptive neural network metamodel where (re)training is done using the data collected by calling the full model. The methodology was first tested on a hypothetical (benchmark) problem to configure the setting requirement. Then the model was applied to a real case study. The results show that significant computational savings can be achieved by using the MOGA-ANN when compared to the approach where MOGA is linked to the full fitness model. When applied to the real case study, optimal solutions identified by MOGA-ANN are obtained 25 times faster than those identified by the full model without significant decrease in the accuracy of the final solution. (C) 2008 Elsevier Ltd. All rights reserved. 2009 * 227(<-505): Learning based brain emotional intelligence as a new aspect for development of an alarm system The multi-criteria and purposeful prediction approach has been introduced and is implemented by the fast and efficient behavioral-based brain emotional learning method. On the other side, emotional learning from the brain model has shown good performance and is characterized by a high generalization property. The new approach is developed to deal with low computational and memory resources and can be used with the largest available data sets. The scope of the paper is to reveal the advantages of emotional learning interpretations of the brain as a purposeful forecasting system designed for warning, and to make a fair comparison between the successful neural (MLP) and neurofuzzy (ANFIS) approaches in their best structures according to prediction accuracy, generalization, and computational complexity. The auroral electrojet (AE) index is used as a practical example of a chaotic time series, and the introduced method is used to make predictions and issue warnings of geomagnetic disturbances and geomagnetic storms based on the AE index.
2008 * 241(<-405): Neural network ensembles: immune-inspired approaches to the diversity of components This work applies two immune-inspired algorithms, namely opt-aiNet and omni-aiNet, to train multi-layer perceptrons (MLPs) to be used in the construction of ensembles of classifiers. The main goal is to investigate the influence of the diversity of the set of solutions generated by each of these algorithms, and if these solutions lead to improvements in performance when combined in ensembles. omni-aiNet is a multi-objective optimization algorithm and, thus, explicitly maximizes the components' diversity at the same time it minimizes their output errors. The opt-aiNet algorithm, by contrast, was originally designed to solve single-objective optimization problems, focusing on the minimization of the output error of the classifiers. However, an implicit diversity maintenance mechanism stimulates the generation of MLPs with different weights, which may result in diverse classifiers. The performances of opt-aiNet and omni-aiNet are compared with each other and with that of a second-order gradient-based algorithm, named MSCG. The results obtained show how the different diversity maintenance mechanisms presented by each algorithm influence the gain in performance obtained with the use of ensembles. 2010 * 242(<-504): The Q-norm complexity measure and the minimum gradient method: A novel approach to the machine learning structural risk minimization problem This paper presents a novel approach for dealing with the structural risk minimization (SRM) applied to a general setting of the machine learning problem. The formulation is based on the fundamental concept that supervised learning is a bi-objective optimization problem in which two conflicting objectives should be minimized. The objectives are related to the empirical training error and the machine complexity. 
In this paper, one general Q-norm method to compute the machine complexity is presented, and, as a particular practical case, the minimum gradient method (MGM) is derived relying on the definition of the fat-shattering dimension. A practical mechanism for parallel layer perceptron (PLP) network training, involving only quasi-convex functions, is generated using the aforementioned definitions. Experimental results on 15 different benchmarks are presented, which show the potential of the proposed ideas. 2008 * 243(<-543): Controlling the parallel layer perceptron complexity using a multiobjective learning algorithm This paper deals with the parallel layer perceptron (PLP) complexity control, and the bias and variance dilemma, using a multiobjective (MOBJ) training algorithm. To control the bias and variance, the training process is rewritten as a bi-objective problem, considering the minimization of both the training error and the norm of the weight vector, which is a measure of the network complexity. This method is applied to regression and classification problems and compared with several other training procedures and topologies. The results show that the PLP MOBJ training algorithm presents good generalization results, outperforming traditional methods in the tested examples. 2007 * 244(<-548): Improving generalization of MLPs with sliding mode control and the Levenberg-Marquardt algorithm A variation of the well-known Levenberg-Marquardt algorithm for training neural networks is proposed in this work. The algorithm presented restricts the norm of the weights vector to a preestablished value and finds the minimum error solution for that norm. The norm constraint controls the neural network's degrees of freedom: the larger the norm, the more flexible the neural model, and therefore the more closely it fits the training set. A range of different norm solutions is generated and the best generalization solution is selected according to the validation set error.
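Several of the entries above (242-244, and 247-249 below) share the same bi-objective formulation: minimize the training error and the weight-vector norm, then choose the best-generalizing model from the resulting trade-off set. A minimal sketch of that selection step, with purely illustrative candidate values (not data from any of the cited papers):

```python
def pareto_front(points):
    """Keep the points not dominated by any other (both objectives minimized)."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)]

# Hypothetical candidate networks as (training error, weight norm) pairs.
candidates = [(0.10, 5.0), (0.05, 7.0), (0.12, 4.0), (0.08, 8.0), (0.20, 10.0)]
front = pareto_front(candidates)  # the error/complexity trade-off set
```

From `front`, the final model would then be picked by validation-set error, as the entries describe.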
The results show the efficiency of the algorithm in terms of generalization performance. (c) 2006 Elsevier B.V. All rights reserved. 2007 * 246(<-560): Many-objective training of a multi-layer perceptron In this paper, a many-objective training scheme for a multi-layer feed-forward neural network is studied. In this scheme, each training data set, or the average over sub-sets of the training data, provides a single objective. A recently proposed group of evolutionary many-objective optimization algorithms based on the NSGA-II algorithm have been examined with respect to the handling of such problem cases. A modified NSGA-II algorithm, using the norm of an individual as a secondary ranking assignment method, appeared to give the best results, even for a large number of objectives (up to 50 in this study). However, there was no notable increase in performance against the standard backpropagation algorithm, and a remarkable drop in performance for higher-dimensional feature spaces (dimension 30 in this study). 2007 * 247(<-645): Training neural networks with a multi-objective sliding mode control algorithm This paper presents a new sliding mode control algorithm that is able to guide the trajectory of a multi-layer perceptron within the plane formed by the two objective functions: training set error and norm of the weight vectors. The results show that the neural networks obtained are able to generate an approximation to the Pareto set, from which an improved generalization performance model is selected. (C) 2002 Elsevier Science B.V. All rights reserved. 2003 * 248(<-661): Recent advances in the MOBJ algorithm for training artificial neural networks. This paper presents a new scheme for training MLPs which employs a relaxation method for multi-objective optimization. The algorithm works by obtaining a reduced set of solutions, from which the one with the best generalization is selected. 
This approach allows balancing between the training error and the norm of the network weight vectors, which are the two objective functions of the multi-objective optimization problem. The method is applied to classification and regression problems and compared with Weight Decay (WD), Support Vector Machines (SVMs) and standard Backpropagation (BP). It is shown that the proposed systematic training procedure results in neural models with good generalization, outperforming traditional methods. 2001 * 249(<-665): Improving generalization of MLPs with multi-objective optimization This paper presents a new learning scheme for improving generalization of multilayer perceptrons. The algorithm uses a multi-objective optimization approach to balance between the error on the training data and the norm of the network weight vectors to avoid overfitting. The results are compared with support vector machines and standard backpropagation. (C) 2000 Elsevier Science B.V. All rights reserved. 2000 * 251(<- 85): Time series forecasting by neural networks: A knee point-based multiobjective evolutionary algorithm approach In this paper, we investigate the problem of time series forecasting using single hidden layer feedforward neural networks (SLFNs), which are optimized via multiobjective evolutionary algorithms. By utilizing adaptive differential evolution (JADE) and the knee point strategy, a nondominated sorting adaptive differential evolution (NSJADE) and its improved version, knee point-based NSJADE (KP-NSJADE), are developed for optimizing SLFNs. JADE, which aims at refining the search area, is introduced into the nondominated sorting genetic algorithm II (NSGA-II). The presented NSJADE shows superiority on multimodal problems when compared with NSGA-II. Then NSJADE is applied to train SLFNs for time series forecasting. It is revealed that individuals with better forecasting performance in the whole population gather around the knee point.
Therefore, KP-NSJADE is proposed to explore the neighborhood of the knee point in the objective space. Simulation results on eight popular time series databases illustrate the effectiveness of the proposed algorithm in comparison with several popular algorithms. (C) 2014 Elsevier Ltd. All rights reserved. 2014 * 252(<-124): An analysis of accuracy-diversity trade-off for hybrid combined system with multiobjective predictor selection This study examines the contribution of diversity, under a multi-objective context, to the promotion of learners in an evolutionary system that generates combinations of partially trained learners. The examined system uses grammar-driven genetic programming to evolve hierarchical, multi-component combinations of multilayer perceptrons and support vector machines for regression. Two advances are studied. First, a ranking formula is developed for the selection probability of the base learners. This formula incorporates both a diversity measure and the performance of learners, and it is tried over a series of artificial and real-world problems. Results show that when the diversity of a learner is incorporated with equal weight to the learner's performance in the evolutionary selection process, the system is able to provide statistically significantly better generalization. The second advance examined is a substitution phase for learners that are over-dominated under a multi-objective Pareto domination assessment scheme. Results here show that the substitution does not significantly improve the system performance; thus the exclusion of very weak learners is not a compelling task for the examined framework.
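One common way to define the knee point used by the KP-NSJADE entry above (251) is the front member farthest from the straight line joining the two extreme solutions. A small sketch of that definition (the choice of distance formula is an assumption, not taken from the paper; the normalizing denominator is dropped since only the argmax matters):

```python
def knee_point(front):
    """Return the front member farthest from the line through the two extremes."""
    front = sorted(front)
    (x1, y1), (x2, y2) = front[0], front[-1]

    def dist(p):
        # Unnormalized point-to-line distance |(y2-y1)x - (x2-x1)y + x2*y1 - y2*x1|.
        return abs((y2 - y1) * p[0] - (x2 - x1) * p[1] + x2 * y1 - y2 * x1)

    return max(front, key=dist)

# Illustrative 2-D front: the sharp bend at (0.2, 0.3) is the knee.
front = [(0.0, 1.0), (0.2, 0.3), (0.6, 0.2), (1.0, 0.0)]
```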
2014 * 255(<-649): Designing a phenotypic distance index for radial basis function neural networks MultiObjective Evolutionary Algorithms (MOEAs) may suffer premature convergence if the selective pressure is too large, so MOEAs usually incorporate a niche-formation procedure to distribute the population over the optimal solutions and let the population evolve until the Pareto-optimal region is completely explored. This niche-formation scheme is based on a distance index that measures the similarity between two solutions in order to decide whether both may share the same niche. The similarity criterion is usually based on a Euclidean norm (given that the two solutions are represented as vectors); nevertheless, as this paper will explain, this kind of metric is not adequate for RBFNNs, making a more suitable distance index necessary. The experimental results obtained show that a MOEA including the proposed distance index is able to explore the Pareto-optimal region sufficiently and provide the user with a wide variety of Pareto-optimal solutions. 2003 * 256(<-658): Hierarchical genetic algorithm for near optimal feedforward neural network design. In this paper, we propose a genetic algorithm based design procedure for a multi-layer feed-forward neural network. A hierarchical genetic algorithm is used to evolve both the neural network's topology and weighting parameters. Compared with traditional genetic algorithm based designs for neural networks, the hierarchical approach addresses several deficiencies, including a feasibility check highlighted in the literature. A multi-objective cost function is used herein to optimize the performance and topology of the evolved neural network simultaneously. In the prediction of the Mackey-Glass chaotic time series, the networks designed by the proposed approach prove to be competitive, or even superior, to traditional learning algorithms for multi-layer perceptron networks and radial basis function networks.
Based upon the chosen cost function, a linear weighted-combination decision-making approach has been applied to derive an approximate Pareto-optimal solution set; designing a set of neural networks can therefore be considered as solving a two-objective optimization problem. 2002 * 257(<-125): Robust parameter design optimization using Kriging, RBF and RBFNN with gradient-based and evolutionary optimization techniques The dual response surface methodology is one of the most commonly used approaches in robust parameter design to simultaneously optimize the mean value and keep the variance minimal. The commonly used meta-model is quadratic polynomial regression; for highly nonlinear input/output relationships, the accuracy of the fitted model is limited. Many researchers have recommended using more complicated surrogate models. In this study, three surrogate models replace the second-order polynomial regression, namely ordinary Kriging, radial basis function approximation (RBF) and the radial basis function artificial neural network (RBFNN). The results show that the three surrogate models present superior accuracy in comparison with quadratic polynomial regression. The mean squared error (MSE) approach is widely used to link the mean and variance in one cost function. In this study, a new approach has been proposed using multi-objective optimization. The new approach has two main advantages over the classical method. First, the conflicting nature of the two objectives can be efficiently handled. Second, the decision maker will have a set of Pareto-front design points to select from. (C) 2014 Elsevier Inc. All rights reserved. 2014 * 260(<-453): Parallel multiobjective memetic RBFNNs design and feature selection for function approximation problems The design of radial basis function neural networks (RBFNNs) still remains a difficult task when they are applied to classification or to regression problems.
The difficulty arises when the parameters that define an RBFNN have to be set: the number of RBFs, the position of their centers and the length of their radii. Another issue that has to be faced when applying these models to real-world applications is selecting the variables that the RBFNN will use as inputs. The literature presents several methodologies to perform these two tasks separately; however, thanks to the intrinsic parallelism of genetic algorithms, the algorithm proposed in this paper can evolve solutions for both problems at the same time. The parallelization of the algorithm consists not only in evolving the two problems jointly but also in specializing the crossover and mutation operators in order to evolve the different elements to be optimized when designing RBFNNs. The underlying genetic algorithm is the non-dominated sorting genetic algorithm II (NSGA-II), which helps to keep a balance between the size of the network and its approximation accuracy in order to avoid overfitted networks. Another novelty of the proposed algorithm is the incorporation of local search algorithms in three stages: initialization of the population, evolution of the individuals and final optimization of the Pareto front. The initialization of the individuals is performed by hybridizing clustering techniques with mutual information (MI) theory to select the input variables. As the experiments will show, the synergy of the different paradigms and techniques combined in the presented algorithm allows very accurate models to be obtained using the most significant input variables. (C) 2009 Published by Elsevier B.V. 2009 * 261(<-544): A new hybrid methodology for cooperative-coevolutionary optimization of radial basis function networks This paper presents a new multiobjective cooperative-coevolutive hybrid algorithm for the design of a Radial Basis Function Network (RBFN).
This approach codifies a population of Radial Basis Functions (RBFs) (hidden neurons), which evolve by means of cooperation and competition to obtain a compact and accurate RBFN. To evaluate the significance of a given RBF in the whole network, three factors have been proposed: the basis function's contribution to the network's output, the error produced in the basis function radius, and the overlapping among RBFs. To achieve an RBFN composed of RBFs with proper values for these quality factors, our algorithm follows a multiobjective approach in the selection process. In the design process, a Fuzzy Rule Based System (FRBS) is used to determine the possibility of applying operators to a certain RBF. As the time required by our evolutionary algorithm to converge is relatively small, it is possible to further improve the solution found by using a local minimization algorithm (for example, the Levenberg-Marquardt method). In this paper, the results of applying our methodology to function approximation and time series prediction problems are also presented and compared with other alternatives proposed in the literature. 2007 * 262(<-638): Multiobjective evolutionary optimization of the size, shape, and position parameters of radial basis function networks for function approximation This paper presents a multiobjective evolutionary algorithm to optimize radial basis function neural networks (RBFNNs) in order to approximate target functions from a set of input-output pairs. The procedure allows the application of heuristics to improve the solution of the problem at hand by including some new genetic operators in the evolutionary process.
These new operators are based on two well-known matrix transformations: singular value decomposition (SVD) and orthogonal least squares (OLS), which have been used to define new mutation operators that produce local or global modifications in the radial basis functions (RBFs) of the networks (the individuals in the population in the evolutionary procedure). After analyzing the efficiency of the different operators, we have shown that the global mutation operators yield an improved procedure to adjust the parameters of the RBFNNs. 2003 * 263(<-204): Memetic multiobjective particle swarm optimization-based radial basis function network for classification problems This paper presents a new multiobjective evolutionary algorithm applied to radial basis function (RBF) network design, based on multiobjective particle swarm optimization augmented with local search features. The algorithm is named the memetic multiobjective particle swarm optimization RBF network (MPSON) because it integrates the accuracy and structure of an RBF network. The proposed algorithm is implemented on two-class and multiclass pattern classification problems, including one complex real problem. The experimental results indicate that the proposed algorithm is viable, and provides an effective means to design multiobjective RBF networks with good generalization capability and compact network structure. The accuracy and complexity of the network obtained by the proposed algorithm are compared with the memetic non-dominated sorting genetic algorithm based RBF network (MGAN) through statistical tests. This study shows that MPSON generates RBF networks with an appropriate balance between accuracy and simplicity, outperforming the other algorithms considered. (C) 2013 Elsevier Inc. All rights reserved.
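Entries 260-263 above all evolve the same underlying model: an RBF network defined by its centers, radii and output weights. A minimal sketch of that model with Gaussian basis functions (parameter names and values are illustrative, not taken from any of the cited papers):

```python
import math

def rbf_output(x, centers, radii, weights, bias=0.0):
    """Evaluate a Gaussian RBF network: bias + sum_i w_i * exp(-||x - c_i||^2 / (2 r_i^2))."""
    out = bias
    for c, r, w in zip(centers, radii, weights):
        d2 = sum((xj - cj) ** 2 for xj, cj in zip(x, c))  # squared distance to center
        out += w * math.exp(-d2 / (2.0 * r * r))
    return out
```

The evolutionary algorithms in these entries search over `centers`, `radii` and `weights` (and the number of basis functions) while trading accuracy against network size.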
2013 * 265(<-325): Memetic Elitist Pareto Differential Evolution algorithm based Radial Basis Function Networks for classification problems This paper presents a new multi-objective evolutionary hybrid algorithm for the design of Radial Basis Function Networks (RBFNs) for classification problems. The algorithm, MEPDEN, is a Memetic Elitist Pareto evolutionary approach based on the Non-dominated Sorting Differential Evolution (NSDE) multiobjective evolutionary algorithm, adapted to design RBFNs and augmented with a local search that uses the back-propagation algorithm. MEPDEN is tested on two-class and multiclass pattern classification problems. The results obtained in terms of Mean Square Error (MSE), number of hidden nodes, accuracy (ACC), sensitivity (SEN), specificity (SPE) and Area Under the receiver operating characteristics Curve (AUC) show that the proposed approach is able to produce higher prediction accuracies with much simpler network structures. The accuracy and complexity of the network obtained by the proposed algorithm are compared with the Memetic Elitist Pareto Non-dominated Sorting Genetic Algorithm based RBFN (MEPGAN) through statistical tests. This study shows that MEPDEN obtains RBFNs with an appropriate balance between accuracy and simplicity, outperforming the other method considered. (C) 2011 Elsevier B.V. All rights reserved. 2011 * 266(<-378): Memetic Pareto Evolutionary Artificial Neural Networks to determine growth/no-growth in predictive microbiology The main objective of this work is to automatically design neural network models with sigmoid basis units for binary classification tasks. The classifiers obtained achieve a double objective: a high classification level on the dataset and a high classification level for each class.
We present MPENSGA2, a Memetic Pareto Evolutionary approach based on the NSGA2 multiobjective evolutionary algorithm, which has been adapted to design Artificial Neural Network models, where the NSGA2 algorithm is augmented with a local search that uses the improved Resilient Backpropagation with backtracking (IRprop+) algorithm. To analyze the robustness of this methodology, it was applied to four complex classification problems in predictive microbiology to describe the growth/no-growth interface of food-borne microorganisms such as Listeria monocytogenes, Escherichia coli R31, Staphylococcus aureus and Shigella flexneri. The results obtained in Correct Classification Rate (CCR), Sensitivity (S) as the minimum of the sensitivities for each class, Area Under the receiver operating characteristic Curve (AUC), and Root Mean Squared Error (RMSE) show that the generalization ability and the classification rate in each class can be more efficiently improved within a multiobjective framework than within a single-objective framework. (C) 2009 Elsevier B.V. All rights reserved. 2011 * 268(<-414): Sensitivity Versus Accuracy in Multiclass Problems Using Memetic Pareto Evolutionary Neural Networks This paper proposes a multiclassification algorithm using multilayer perceptron neural network models. It seeks to boost two conflicting main objectives of multiclassifiers: a high correct classification rate and a high classification rate for each class. This last objective is not usually optimized in classification, but is considered here given the need to obtain high precision in each class in real problems. To solve this machine learning problem, we use a Pareto-based multiobjective optimization methodology based on a memetic evolutionary algorithm. We consider a memetic Pareto evolutionary approach based on the NSGA2 evolutionary algorithm (MPENSGA2).
Once the Pareto front is built, two strategies for automatic individual selection are used: the best model in accuracy and the best model in sensitivity (the extremes of the Pareto front). These methodologies are applied to solve 17 classification benchmark problems obtained from the University of California at Irvine (UCI) repository and one complex real classification problem. The models obtained show high accuracy and a high classification rate for each class. 2010 * 269(<-379): Radial basis function network based on time variant multi-objective particle swarm optimization for medical diseases diagnosis This paper proposes an adaptive evolutionary radial basis function (RBF) network algorithm to evolve the accuracy and connections (centers and weights) of RBF networks simultaneously. The problem of hybrid learning of RBF networks is discussed with multi-objective optimization methods to improve classification accuracy for medical disease diagnosis. In this paper, we introduce a time variant multi-objective particle swarm optimization (TVMOPSO) of radial basis function (RBF) networks for diagnosing medical diseases. This study applied RBF network training to determine whether RBF networks can be developed using TVMOPSO, and the performance is validated based on accuracy and complexity. Our approach is tested on three standard data sets from the UCI machine learning repository. The results show that our approach is a viable alternative and provides an effective means to solve the multi-objective RBF network design problem for medical disease diagnosis. It performs better than RBF networks based on MOPSO and NSGA-II, and is also competitive with other methods in the literature. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 270(<-418): An Adaptive Multiobjective Approach to Evolving ART Architectures In this paper, we present the evolution of adaptive resonance theory (ART) neural network architectures (classifiers) using a multiobjective optimization approach.
In particular, we propose the use of a multiobjective evolutionary approach to simultaneously evolve the weights and the topology of three well-known ART architectures: fuzzy ARTMAP (FAM), ellipsoidal ARTMAP (EAM), and Gaussian ARTMAP (GAM). We refer to the resulting architectures as MO-GFAM, MO-GEAM, and MO-GGAM, and collectively as MO-GART. The major advantage of MO-GART is that it produces a number of solutions for the classification problem at hand that have different levels of merit [accuracy on unseen data (generalization) and size (number of categories created)]. MO-GART is shown to be more elegant (it does not require user intervention to define the network parameters), more effective (of better accuracy and smaller size), and more efficient (faster to produce the solution networks) than other ART neural network architectures that have appeared in the literature. Furthermore, MO-GART is shown to be competitive with other popular classifiers, such as classification and regression trees (CART) and support vector machines (SVMs). 2010 * 271(<-578): Applications of multi-objective structure optimization We present applications of multi-objective evolutionary optimization of feed-forward neural networks (NNs) to two real-world problems, car and face classification. The possibly conflicting requirements on the NNs are speed and classification accuracy, both of which can enhance the embedding systems as a whole. We compare the results to the outcome of a greedy optimization heuristic (magnitude-based pruning) coupled with a multi-objective performance evaluation. For the car classification problem, magnitude-based pruning yields competitive results, whereas for the more difficult face classification, we find that the evolutionary approach to NN design is clearly preferable. (c) 2006 Elsevier B.V. All rights reserved.
2006 * 274(<-113): Metrics to guide a multi-objective evolutionary algorithm for ordinal classification Ordinal classification, or ordinal regression, is a classification problem in which the labels have an ordered arrangement between them. Due to this order, alternative performance evaluation metrics need to be used in order to consider the magnitude of errors. This paper presents a study of the use of a multi-objective optimization approach in the context of ordinal classification. We contribute a study of ordinal classification performance metrics, and propose a new performance metric, the maximum mean absolute error (MMAE). MMAE considers the per-class distribution of patterns and the magnitude of the errors, both issues being crucial for ordinal regression problems. In addition, we empirically show that some of the performance metrics are competitive objectives, which justifies the use of multi-objective optimization strategies. In our case, a multi-objective evolutionary algorithm optimizes an artificial neural network ordinal model with different pairs of metric combinations, and we conclude that the pair of the mean absolute error (MAE) and the proposed MMAE is the most favourable. A study of the relationship between the metrics of this proposal is performed, and the graphical representation in the two-dimensional space where the search of the evolutionary algorithm takes place is analysed. The results obtained show good classification performance, opening new lines of research in the evaluation and model selection of ordinal classifiers. (C) 2014 Elsevier B.V. All rights reserved. 2014 * 276(<-334): Weighting Efficient Accuracy and Minimum Sensitivity for Evolving Multi-Class Classifiers Recently, a multi-objective Sensitivity-Accuracy based methodology has been proposed for building classifiers for multi-class problems. This technique is especially suitable for imbalanced and multi-class datasets.
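The MMAE metric proposed in entry 274 above is simple to state: compute the mean absolute error separately for each true class and report the worst one. A sketch with integer ordinal labels (toy data, not taken from the paper):

```python
def mmae(y_true, y_pred):
    """Maximum over classes of that class's mean absolute label error."""
    per_class = []
    for c in set(y_true):
        errs = [abs(t - p) for t, p in zip(y_true, y_pred) if t == c]
        per_class.append(sum(errs) / len(errs))
    return max(per_class)

# Overall MAE here is 0.5, but the worst class (label 2) has mean error 1.0,
# which is what MMAE exposes on imbalanced or per-class-degraded predictions.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
```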
Moreover, the high computational cost of multi-objective approaches is well known, so more efficient alternatives must be explored. This paper presents an efficient alternative to the Pareto-based solution when considering both Minimum Sensitivity and Accuracy in multi-class classifiers. The alternatives are implemented by extending the Evolutionary Extreme Learning Machine algorithm for training artificial neural networks. Experiments were performed to select the best option after considering alternative proposals and related methods. Based on the experiments, this methodology is competitive in Accuracy, Minimum Sensitivity and efficiency. 2011 * 277(<-561): A cooperative constructive method for neural networks for pattern recognition In this paper, we propose a new constructive method, based on cooperative coevolution, for automatically designing the structure of a neural network for classification. Our approach is based on a modular construction of the neural network by means of a cooperative evolutionary process. This process benefits from the advantages of coevolutionary computation as well as the advantages of constructive methods. The proposed methodology can be easily extended to work with almost any kind of classifier. The evaluation of each module that constitutes the network is made using a multiobjective method. Thus, each new module can be evaluated in a comprehensive way, considering different aspects, such as performance, complexity, or degree of cooperation with the previous modules of the network. In this way, the method has the advantage of considering not only the performance of the networks, but also other features. The method is tested on 40 classification problems from the UCI machine learning repository with very good performance.
The method is thoroughly compared with two other constructive methods, cascade correlation and GMDH networks, and with other classification methods, namely SVM, C4.5, and k nearest-neighbours, as well as an ensemble of neural networks constructed using four different methods. (c) 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. 2007 * 278(<-599): Cooperative coevolution of artificial neural network ensembles for pattern classification This paper presents a cooperative coevolutive approach for designing neural network ensembles. Cooperative coevolution is a recent paradigm in evolutionary computation that allows the effective modeling of cooperative environments. Although, theoretically, a single neural network with a sufficient number of neurons in the hidden layer would suffice to solve any problem, in practice, for many real-world problems it is too hard to construct the appropriate network to solve them. In such problems, neural network ensembles are a successful alternative. Nevertheless, the design of neural network ensembles is a complex task. In this paper, we propose a general framework for designing neural network ensembles by means of cooperative coevolution. The proposed model has two main objectives: first, the improvement of the combination of the trained individual networks; second, the cooperative evolution of such networks, encouraging collaboration among them, instead of a separate training of each network. In order to favor the cooperation of the networks, each network is evaluated throughout the evolutionary process using a multiobjective method. For each network, different objectives are defined, considering not only its performance on the given problem, but also its cooperation with the rest of the networks. In addition, a population of ensembles is evolved, improving the combination of networks and obtaining subsets of networks to form ensembles that perform better than the combination of all the evolved networks.
The proposed model is applied to ten real-world classification problems of very different natures from the UCI machine learning repository and the proben1 benchmark set. In all of them, the performance of the model is better than the performance of standard ensembles in terms of generalization error. Moreover, the size of the obtained ensembles is also smaller. 2005 * 283(<-569): Feature selection for ensembles applied to handwriting recognition Feature selection for ensembles has been shown to be an effective strategy for ensemble creation due to its ability to produce good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper, we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The underpinning paradigm is "overproduce and choose". The algorithm operates on two levels: firstly, it performs feature selection in order to generate a set of classifiers, and then it chooses the best team of classifiers. In order to show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we consider the problem of handwritten digit recognition and use three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we take into account the problem of handwritten month word recognition and use three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrate that the proposed methodology brings compelling improvements when classifiers have to work with very low error rates. Comparisons have been made by considering the recognition rates only.
2006 * 284(<-210): A new approach to radial basis function-based polynomial neural networks: analysis and design In this study, we introduce a new topology of radial basis function-based polynomial neural networks (RPNNs) that is based on a genetically optimized multi-layer perceptron with radial polynomial neurons (RPNs). This paper offers a comprehensive design methodology involving various mechanisms of optimization, especially fuzzy C-means (FCM) clustering and particle swarm optimization (PSO). In contrast to the typical architectures encountered in polynomial neural networks (PNNs), our main objective is to develop a topology and establish a comprehensive design strategy of RPNNs: (a) The architecture of the proposed network consists of radial polynomial neurons (RPNs). These neurons are fully reflective of the structure encountered in numeric data, which are granulated with the aid of FCM clustering. The RPN dwells on the concepts of a collection of radial basis functions and function-based nonlinear polynomial processing. (b) The PSO-based design procedure applied to each layer of the RPNN leads to the selection of preferred nodes of the network whose local parameters (such as the number of input variables, a collection of the specific subset of input variables, the order of the polynomial, the number of clusters of FCM clustering, and a fuzzification coefficient of the FCM method) are properly adjusted. The performance of the RPNN is quantified through a series of experiments where we use several modeling benchmarks, namely synthetic three-dimensional data and machine learning data (computer hardware data, abalone data, MPG data, and Boston housing data) already used in neuro-fuzzy modeling. A comparative analysis shows that the proposed RPNN exhibits higher accuracy in comparison with some previous models available in the literature.
2013 * 286(<-576): Using a multi-objective genetic algorithm for SVM construction Support Vector Machines are kernel machines useful for classification and regression problems. In this paper, they are used for non-linear regression of environmental data. From a structural point of view, Support Vector Machines are particular Artificial Neural Networks and their training paradigm has some positive implications. In fact, the original training approach is useful to overcome the curse of dimensionality and overly strict assumptions on the statistics of the errors in data. Support Vector Machines and Radial Basis Function Regularised Networks are presented within a common structural framework for non-linear regression in order to emphasise the training strategy for support vector machines and to better explain the multi-objective approach in support vector machines' construction. A support vector machine's performance depends on the kernel parameter, input selection and epsilon-tube optimal dimension. These will be used as decision variables for the evolutionary strategy based on a Genetic Algorithm, which exhibits the number of support vectors, for the capacity of the machine, and the fitness to a validation subset, for the model accuracy in mapping the underlying physical phenomena, as objective functions. The strategy is tested on a case study dealing with groundwater modelling, based on time series (past measured rainfalls and levels) for level predictions at variable time horizons. 2006 * 287(<-209): A multi-objective micro genetic ELM algorithm The extreme learning machine (ELM) is a methodology for learning single-hidden layer feedforward neural networks (SLFN) which has been proved to be extremely fast and to provide very good generalization performance. ELM works by randomly choosing the weights and biases of the hidden nodes and then analytically obtaining the output weights and biases for an SLFN with the number of hidden nodes previously fixed.
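The ELM mechanism summarized in entry 287 above (random, untrained hidden parameters; output weights obtained analytically) is concrete enough to sketch. A minimal NumPy illustration on hypothetical toy data — plain ELM only, not the mu G-ELM algorithm the entry goes on to propose; data, seed, and network size are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression task: learn y = sin(x) on [-3, 3]
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X)

# Step 1: randomly choose hidden weights and biases (these are never trained)
n_hidden = 40
W = rng.normal(size=(1, n_hidden))
b = rng.normal(size=n_hidden)

# Step 2: hidden-layer output matrix under a sigmoid activation
H = 1.0 / (1.0 + np.exp(-(X @ W + b)))

# Step 3: output weights obtained analytically via the Moore-Penrose pseudo-inverse
beta = np.linalg.pinv(H) @ y

mse = float(np.mean((H @ beta - y) ** 2))
print(f"training MSE: {mse:.5f}")
```

Because only the linear output layer is solved for, training reduces to one pseudo-inverse; the multi-objective extensions cited in this entry then search over the remaining free choice, the number of hidden nodes.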
In this work, we develop a multi-objective micro genetic ELM (mu G-ELM) which provides the appropriate number of hidden nodes for the problem being solved as well as the weights and biases which minimize the mean square error (MSE). The multi-objective algorithm is guided by two criteria: the number of hidden nodes and the MSE. Furthermore, as a novelty, mu G-ELM incorporates a regression device in order to decide whether the number of hidden nodes of the individuals of the population should be increased, decreased, or left unchanged. In general, the proposed algorithm achieves lower errors while also requiring fewer hidden nodes for the data sets and competitors considered. (C) 2013 Elsevier B.V. All rights reserved. 2013 * 288(<-424): A multi-objective memetic and hybrid methodology for optimizing the parameters and performance of artificial neural networks The use of artificial neural networks implies considerable time spent choosing a set of parameters that contribute toward improving the final performance. Initial weights, the number of hidden nodes and layers, training algorithm rates and transfer functions are normally selected through a manual process of trial-and-error that often fails to find the best possible set of neural network parameters for a specific problem. This paper proposes an automatic search methodology for the optimization of the parameters and performance of neural networks relying on the use of Evolution Strategies, Particle Swarm Optimization and concepts from Genetic Algorithms corresponding to the hybrid and global search module. There is also a module that refers to local searches, including the well-known Multilayer Perceptrons, Back-propagation and the Levenberg-Marquardt training algorithms. The methodology proposed here performs the search using the aforementioned parameters in an attempt to optimize the networks and performance.
Experiments were performed and the results proved the proposed method to be better than trial-and-error and other methods found in the literature. Crown Copyright (C) 2009 Published by Elsevier B.V. All rights reserved. 2010 * 293(<-170): MULTI-OBJECTIVE OPTIMIZATION BY MEANS OF MULTI-DIMENSIONAL MLP NEURAL NETWORKS In this paper, a multi-layer perceptron (MLP) neural network (NN) is put forward as an efficient tool for performing two tasks: 1) optimization of multi-objective problems and 2) solving a non-linear system of equations. In both cases, mathematical functions which are continuous and partially bounded are involved. Previously, these two tasks were performed by recurrent neural networks and also strong algorithms like evolutionary ones. In this study, multi-dimensional structure in the output layer of the MLP-NN, as an innovative method, is utilized to implicitly optimize the multivariate functions under the network energy optimization mechanism. To this end, the activation functions in the output layer are replaced with the multivariate functions intended to be optimized. The effective training parameters in the global search are surveyed. Also, it is demonstrated that the MLP-NN with proper dynamic learning rate is able to find globally optimal solutions. Finally, the efficiency of the MLP-NN in both aspects of speed and power is investigated by some well-known experimental examples. In some of these examples, the proposed method gives explicitly better globally optimal solutions compared to that of the other references and also shows completely satisfactory results in other experiments. 2014 * 295(<-659): Improving neural networks generalization with new constructive and pruning methods This paper presents a new constructive method and pruning approaches to control the design of Multi-Layer Perceptron (MLP) without loss in performance. The proposed methods use a multi-objective approach to guarantee generalization. 
The constructive approach searches for an optimal solution according to the Pareto set shape with an increasing number of hidden nodes. The pruning methods are able to simplify the network topology and to identify linear connections between the inputs and outputs of the neural model. Topology information and validation sets are used. 2002 * 301(<-536): Learning multicriteria fuzzy classification method PROAFTN from data In this paper, we present a new methodology for learning the parameters of the multiple criteria classification method PROAFTN from data. There are numerous representations and techniques available for data mining, for example decision trees, rule bases, artificial neural networks, density estimation, regression and clustering. The PROAFTN method constitutes another approach for data mining. It belongs to the class of supervised learning algorithms and assigns membership degrees of the alternatives to the classes. The PROAFTN method requires the elicitation of its parameters for the purpose of classification. Therefore, we need an automatic method that helps us to establish these parameters from the given data with minimum classification errors. Here, we propose a variable neighborhood search metaheuristic for obtaining these parameters. The performance of the newly proposed method was evaluated using the 10-fold cross-validation technique. The results are compared with those obtained by other classification methods previously reported on the same data. It appears that solutions of substantially better quality are obtained with the proposed method than with the former ones. Crown Copyright (c) 2005 Published by Elsevier Ltd. All rights reserved. 2007 * 302(<-546): Fuzzy integral-based perceptron for two-class pattern classification problems The single-layer perceptron with a single output node is a well-known neural network for two-class classification problems. Furthermore, the sigmoid or logistic function is usually used as the activation function in the output neuron.
A critical step is to compute the sum of the products of the connection weights with the corresponding inputs, which embodies an assumption of additivity among the individual variables. Unfortunately, because the input variables are not always independent of each other, an assumption of additivity may not be reasonable. In this paper, the inner product is replaced with an aggregation value obtained by a useful fuzzy integral, viewing each of the connection weights as a value of a lambda-fuzzy measure for the corresponding variable. A genetic algorithm is then employed to obtain connection weights by maximizing the number of correctly classified training patterns and minimizing the errors between the actual and desired outputs of individual training patterns. The experimental results further demonstrate that the proposed method outperforms the traditional single-layer perceptron and performs well in comparison with other fuzzy or non-fuzzy classification methods. (c) 2006 Elsevier Inc. All rights reserved. 2007 * 303(<-590): Training of multilayer perceptron neural networks by using cellular genetic algorithms This paper deals with a method for training neural networks by using cellular genetic algorithms (CGA). This method was implemented as software, CGANN-Trainer, which was used to generate binary classifiers for recognition of patterns associated with breast cancer images in a multi-objective optimization problem. The results reached by the CGA with the Wisconsin Breast Cancer Database, and the Wisconsin Diagnostic Breast Cancer Database, were compared with some other methods previously reported using the same databases, proving to be an interesting alternative. 2006 * 304(<-632): Multicriteria fuzzy classification procedure PROCFTN: methodology and medical application In this paper, we introduce a new classification procedure for assigning objects to predefined classes, named PROCFTN.
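The fuzzy-integral aggregation described in entry 302 above replaces the perceptron's additive inner product with a fuzzy (Choquet-type) integral over a non-additive measure. A generic sketch of that aggregation step with assumed measure values — the entry's lambda-fuzzy measure construction and GA-based weight learning are not reproduced here:

```python
def choquet(values, measure):
    """Discrete Choquet integral of `values` w.r.t. the set function `measure`.

    `measure` maps frozensets of input indices to [0, 1], with the empty set
    at 0.0 and the full index set at 1.0; it need not be additive.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    remaining = set(order)
    total, prev = 0.0, 0.0
    for i in order:  # sweep the input values in ascending order
        total += (values[i] - prev) * measure[frozenset(remaining)]
        prev = values[i]
        remaining.discard(i)
    return total

# Assumed non-additive measure over two inputs: the pair is worth less than
# the sum of the singletons (modeling redundant, correlated variables).
g = {frozenset(): 0.0, frozenset({0}): 0.6,
     frozenset({1}): 0.7, frozenset({0, 1}): 1.0}

choquet_value = choquet([0.5, 0.2], g)
additive_value = 0.5 * 0.6 + 0.2 * 0.7  # the plain inner product, for contrast
```

With correlated inputs the Choquet value discounts the overlap, which is precisely the effect an additive inner product cannot express.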
This procedure is based on a fuzzy scoring function for choosing a subset of prototypes, which bear the closest resemblance to the object to be assigned. It then applies the majority-voting rule to assign an object to a class. We also present a medical application of this procedure as an aid in the diagnosis of central nervous system tumours. The results are compared with those obtained by other classification methods, reported on the same data set, including decision tree, production rules, neural network, k nearest neighbor, multilayer perceptron and logistic regression. Our results are very encouraging and show that the multicriteria decision analysis approach can be successfully used to help medical diagnosis. Crown Copyright (C) 2003 Published by Elsevier B.V. All rights reserved. 2004 * 315(<-685): ARTIFICIAL NEURAL NETWORKS VERSUS NATURAL NEURAL NETWORKS - A CONNECTIONIST PARADIGM FOR PREFERENCE ASSESSMENT Preference is an essential ingredient in all decision processes. This paper presents a new connectionist paradigm for preference assessment in a general multicriteria decision setting. A general structure of an artificial neural network for representing two specified prototypes of preference structures is discussed. An interactive preference assessment procedure and an autonomous learning algorithm based on a novel scheme of supervised learning are proposed. Operating characteristics of the proposed paradigm are also illustrated through detailed results of numerical simulations. 1994 * 316(<- 52): Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data In industrial process control, there may be multiple performance objectives, depending on salient features of the input-output data. To address this situation, this paper proposes multiple actor-critic structures to obtain the optimal control via input-output data for unknown nonlinear systems.
The shunting inhibitory artificial neural network (SIANN) is used to classify the input-output data into one of several categories. Different performance measure functions may be defined for disparate categories. The approximate dynamic programming algorithm, which contains a model module, a critic network, and an action network, is used to establish the optimal control in each category. A recurrent neural network (RNN) model is used to reconstruct the unknown system dynamics using input-output data. NNs are used to approximate the critic and action networks, respectively. It is proven that the model error and the closed unknown system are uniformly ultimately bounded. Simulation results demonstrate the performance of the proposed optimal control scheme for the unknown nonlinear system. 2015 * 321(<-610): Evolutionary multi-objective optimization for simultaneous generation of signal-type and symbol-type representations It has been a controversial issue in cognitive science and artificial intelligence research whether signal-type representations (typically connectionist networks) or symbol-type representations (e.g., semantic networks, production systems) should be used. Meanwhile, it has also been recognized that both types of information representations might exist in the human brain. In addition, symbol-type representations are often very helpful in gaining insights into unknown systems. For these reasons, comprehensible symbolic rules need to be extracted from trained neural networks. In this paper, an evolutionary multi-objective algorithm is employed to generate multiple models that facilitate the generation of signal-type and symbol-type representations simultaneously.
It is argued that one main difference between signal-type and symbol-type representations lies in the fact that signal-type representations are models of a higher complexity (fine representation), whereas symbol-type representations are models of a lower complexity (coarse representation). Thus, by generating models with a spectrum of model complexity, we are able to obtain a population of models of both signal-type and symbol-type quality, although certain post-processing is needed to get a fully symbol-type representation. An illustrative example is given on generating neural networks for the breast cancer diagnosis benchmark problem. 2005 * 324(<-377): A multi-objective artificial immune algorithm for parameter optimization in support vector machine Support vector machine (SVM) is a classification method based on the structural risk minimization principle. The penalty parameter C and kernel parameter sigma of an SVM must be carefully selected in establishing an efficient SVM model. These parameters are usually selected by trial and error or from human experience. An artificial immune system (AIS) can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. A multi-objective artificial immune algorithm has been used to optimize the kernel and penalty parameters of the SVM in this paper. In the training stage of the SVM, multiple solutions are found by using the multi-objective artificial immune algorithm, and then these parameters are evaluated in the test stage. The proposed algorithm is applied to fault diagnosis of induction motors and anomaly detection problems, and successful results are obtained. (c) 2009 Elsevier B.V. All rights reserved. 2011 * 326(<-687): USING GENETIC ALGORITHMS FOR AN ARTIFICIAL NEURAL-NETWORK MODEL INVERSION Genetic algorithms (GAs) and artificial neural networks (ANNs) are techniques for optimization and learning, respectively, which both have been adopted from nature.
Their main advantage over traditional techniques is the relatively better performance when applied to complex relations. GAs and ANNs are both self-learning systems, i.e., they do not require any background knowledge from the creator. In this paper, we describe the performance of a GA that finds hypothetical physical structures of poly(ethylene terephthalate) (PET) yarns corresponding to a certain combination of mechanical and shrinkage properties. This GA uses a validated ANN that has been trained for the complex relation between the structure and properties of PET. This technique was tested by comparing the optimal points found by the GA with known experimental data under a variety of multi-criteria conditions. 1993 * 329(<-271): Convergence analysis of sliding mode trajectories in multi-objective neural networks learning The Pareto-optimality concept is used in this paper in order to represent a constrained set of solutions that are able to trade-off the two main objective functions involved in neural networks supervised learning: data-set error and network complexity. The neural network is described as a dynamic system having error and complexity as its state variables and learning is presented as a process of controlling a learning trajectory in the resulting state space. In order to control the trajectories, sliding mode dynamics is imposed on the network. It is shown that arbitrary learning trajectories can be achieved by maintaining the sliding mode gains within their convergence intervals. Formal proofs of convergence conditions are therefore presented. The concept of trajectory learning presented in this paper goes beyond the selection of a final state in the Pareto set, since it can be reached through different trajectories and states in the trajectory can be assessed individually against an additional objective function. (c) 2012 Elsevier Ltd. All rights reserved.
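Several entries above (e.g., 295 and 329) treat error and network complexity as competing objectives and retain only the non-dominated candidates. A minimal, generic sketch of such a Pareto filter, with hypothetical (error, hidden-node) pairs — not the sliding-mode machinery of entry 329:

```python
def pareto_front(points):
    """Keep the non-dominated points, minimizing every objective.

    A point is dominated if some other point is no worse in all objectives
    and differs from it (hence strictly better in at least one).
    """
    front = []
    for p in points:
        dominated = any(
            q != p and all(q[i] <= p[i] for i in range(len(p)))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical candidate networks: (validation error, number of hidden nodes)
candidates = [(0.10, 12), (0.15, 6), (0.12, 8), (0.20, 4), (0.11, 12)]
front = pareto_front(candidates)
```

Each surviving pair trades accuracy against complexity; picking one final network from this set is exactly the decision step the multi-objective methods above leave to the designer.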
2012 * 332(<-597): Intelligent interactive multiobjective optimization method and its application to reliability optimization In most practical situations involving reliability optimization, there are several mutually conflicting goals such as maximizing the system reliability and minimizing the cost, weight and volume. This paper develops an effective multiobjective optimization method, the Intelligent Interactive Multiobjective Optimization Method (IIMOM). In IIMOM, the general concept of the model parameter vector is proposed. From a practical point of view, a designer's preference structure model is built using Artificial Neural Networks (ANNs) with the model parameter vector as the input and the preference information articulated by the designer over representative samples from the Pareto frontier as the desired output. Then with the ANN model of the designer's preference structure as the objective, an optimization problem is solved to search for improved solutions and guide the interactive optimization process intelligently. IIMOM is applied to the reliability optimization problem of a multi-stage mixed system with five different value functions simulating the designer in the solution evaluation process. The results illustrate that IIMOM is effective in capturing different kinds of preference structures of the designer, and it provides a complete and effective solution for medium- and small-scale multiobjective optimization problems. 2005 * 336(<-644): Simulation metamodeling through artificial neural networks Simulation metamodeling has been a major research field during the last decade. The main objective has been to provide robust, fast decision support aids to enhance the overall effectiveness of decision-making processes. This paper discusses the importance of simulation metamodeling through artificial neural networks (ANNs), and provides general guidelines for the development of ANN-based simulation metamodels. 
Such guidelines were successfully applied in the development of two ANNs trained to estimate the manufacturing lead times (MLT) for orders simultaneously processed in a four-machine job shop. The design of intelligent systems such as ANNs may help to avoid some of the drawbacks of traditional computer simulation. Metamodels offer significant advantages regarding time consumption and simplicity in evaluating multi-criteria situations. Their operation is remarkably fast compared to the time required to operate conventional simulation packages. (C) 2003 Elsevier Ltd. All rights reserved. 2003 * 340(<-640): Speeding up backpropagation using multiobjective evolutionary algorithms The use of backpropagation for training artificial neural networks (ANNs) is usually associated with a long training process. The user needs to experiment with a number of network architectures; with larger networks, more computational cost in terms of training time is required. The objective of this letter is to present an optimization algorithm, comprising a multiobjective evolutionary algorithm and a gradient-based local search. In the rest of the letter, this is referred to as the memetic Pareto artificial neural network algorithm for training ANNs. The evolutionary approach is used to train the network and simultaneously optimize its architecture. The result is a set of networks, with each network in the set attempting to optimize both the training error and the architecture. We also present a self-adaptive version with lower computational cost. We show empirically that the proposed method is capable of reducing the training time compared to gradient-based techniques. 2003 * 343(<-675): Pattern classification by linear goal programming and its extensions Pattern classification is one of the main themes in pattern recognition, and has been tackled by several methods such as statistical methods, artificial neural networks, mathematical programming and so on.
Among them, the multi-surface method proposed by Mangasarian is very attractive, because it can provide an exact discrimination function even for highly nonlinear problems without any assumption on the data distribution. However, the method often causes many slits on the discrimination curve. In other words, the piecewise linear discrimination curve is sometimes too complex, resulting in a poor generalization ability. In this paper, several trials to overcome the difficulties of the multi-surface method are suggested. One of them is the utilization of goal programming, in which the auxiliary linear programming problem is formulated as a goal program in order to get as simple discrimination curves as possible. Another one is to apply fuzzy programming, by which we can get fuzzy discrimination curves with gray zones. In addition, it will be shown that, using the suggested methods, additional learning can be carried out easily. These features of the methods make the discrimination more realistic. The effectiveness of the methods is shown on the basis of some applications. 1998 * 344(<-404): Multiple criteria optimization-based data mining methods and applications: a systematic survey Support Vector Machine, an optimization technique, is well known in the data mining community. In fact, many other optimization techniques have been effectively used in dealing with data separation and analysis. For the last 10 years, the author and his colleagues have proposed and extended a series of optimization-based classification models via Multiple Criteria Linear Programming (MCLP) and Multiple Criteria Quadratic Programming (MCQP). These methods are different from statistics, decision tree induction, and neural networks. The purpose of this paper is to review the basic concepts and frameworks of these methods and promote the research interests in the data mining community. According to the evolution of multiple criteria programming, the paper starts with the bases of MCLP.
Then, it further discusses penalized MCLP, MCQP, Multiple Criteria Fuzzy Linear Programming (MCFLP), Multi-Class Multiple Criteria Programming (MCMCP), and the kernel-based Multiple Criteria Linear Program, as well as MCLP-based regression. This paper also outlines several applications of Multiple Criteria optimization-based data mining methods, such as Credit Card Risk Analysis, Classification of HIV-1 Mediated Neuronal Dendritic and Synaptic Damage, Network Intrusion Detection, Firm Bankruptcy Prediction, and VIP E-Mail Behavior Analysis. 2010 * 346(<-652): Multi-objective cooperative coevolution of artificial neural networks (multi-objective cooperative networks) In this paper we present a cooperative coevolutive model for the evolution of neural network topology and weights, called MOBNET. MOBNET evolves subcomponents that must be combined in order to form a network, instead of whole networks. The problem of assigning credit to the subcomponents is approached as a multi-objective optimization task. The subcomponents in a cooperative coevolutive model must fulfill different criteria to be useful; these criteria usually conflict with each other. The problem of evaluating the fitness of an individual based on many criteria that must be optimized together can be approached as a multi-criteria optimization problem, so the methods from multi-objective optimization offer the most natural way to solve the problem. In this work we show how, by using several objectives for every subcomponent and evaluating its fitness as a multi-objective optimization problem, the performance of the model is highly competitive. MOBNET is compared with several standard methods of classification and with other neural network models in solving four real-world problems, and it shows the best overall performance of all classification methods applied. It also produces smaller networks when compared to other models.
The basic idea underlying MOBNET is extensible to a more general model of coevolutionary computation, as none of its features are exclusive to neural network design. There are many applications of cooperative coevolution that could benefit from the multi-objective optimization approach proposed in this paper. (C) 2002 Elsevier Science Ltd. All rights reserved. 2002 * 347(<-362): Learning in the feed-forward random neural network: A critical review The Random Neural Network (RNN) has received, since its inception in 1989, considerable attention and has been successfully used in a number of applications. In this critical review paper we focus on the feed-forward RNN model and its ability to solve classification problems. In particular, we paid special attention to the RNN literature related to learning algorithms that discover the RNN interconnection weights, suggested other potential algorithms that can be used to find the RNN interconnection weights, and compared the RNN model with other neural-network based and non-neural network based classifier models. In review, the extensive literature review and experimentation with the RNN feed-forward model provided us with the necessary guidance to introduce six critical review comments that identify some gaps in the RNN-related literature and suggest directions for future research. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 352(<-662): Evolutionary optimization of RBF networks. One of the main obstacles to the widespread use of artificial neural networks is the difficulty of adequately defining values for their free parameters. This article discusses how Radial Basis Function (RBF) networks can have their parameters defined by genetic algorithms. To this end, it presents an overall view of the problems involved and the different approaches used to genetically optimize RBF networks.
A new strategy to optimize RBF networks using genetic algorithms is proposed, which includes a new representation, a new crossover operator and the use of a multiobjective optimization criterion. Experiments using a benchmark problem are performed and the results achieved using this model are compared to those achieved by other approaches. 2001 * 356(<-594): Stopping criteria for ensemble of evolutionary artificial neural networks The formation of ensembles of artificial neural networks has attracted the attention of researchers in the machine learning and statistical inference domains. It has been shown that combining different neural networks could improve the generalization ability of the learning machine. One challenge is when to stop the training or evolution of the neural networks to avoid overfitting. In this paper, we show that different early stopping criteria based on (i) the minimum validation fitness of the ensemble, and (ii) the minimum of the average population validation fitness could generalize better than the survival population in the last generation. The proposition was tested on four different ensemble methods: (i) a simple ensemble method, where each individual of the population (created and maintained by the evolutionary process) is used as a committee member, (ii) an ensemble with an island model as a diversity promotion mechanism, (iii) a recent successful ensemble method, namely an ensemble with negative correlation learning, and (iv) an ensemble formed by applying multi-objective optimization. The experimental results suggested that using the minimum validation fitness of the ensemble as an early stopping criterion is beneficial. (C) 2005 Elsevier B.V. All rights reserved. 2005 * 369(<-499): Hybrid multiobjective evolutionary design for artificial neural networks Evolutionary algorithms are a class of stochastic search methods that attempt to emulate the biological process of evolution, incorporating concepts of selection, reproduction, and mutation.
In recent years, there has been an increase in the use of evolutionary approaches in the training of artificial neural networks (ANNs). While evolutionary techniques for neural networks have been shown to provide superior performance over conventional training approaches, the simultaneous optimization of network performance and architecture will almost always result in a slow training process due to the added algorithmic complexity. In this paper, we present a geometrical measure based on the singular value decomposition (SVD) to estimate the necessary number of neurons to be used in training a single-hidden-layer feedforward neural network (SLFN). In addition, we develop a new hybrid multiobjective evolutionary approach that includes the features of a variable-length representation that allows for easy adaptation of neural network structures, an architectural recombination procedure based on the geometrical measure that adapts the number of necessary hidden neurons and facilitates the exchange of neuronal information between candidate designs, and a microhybrid genetic algorithm (mu HGA) with an adaptive local search intensity scheme for local fine-tuning. In addition, the performances of well-known algorithms as well as the effectiveness and contributions of the proposed approach are analyzed and validated through a variety of data set types.
Finally, the procedure is extended to MOP and MLP problems. To solve the resulting differential equations, a steepest descent search technique is used. This proposed nontraditional algorithm is efficient for solving complex problems, and is especially useful for implementation on large-scale VLSI, in which the MOP and MLP problems can be solved on a real-time basis. To illustrate the approach, several numerical examples are solved and compared. (C) 2004 Elsevier Ltd. All rights reserved. 2004 * 377(<-513): Soft computing in engineering design - A review The present paper surveys the application of soft computing (SC) techniques in engineering design. Within this context, fuzzy logic (FL), genetic algorithms (GA) and artificial neural networks (ANN), as well as their fusion, are reviewed in order to examine the capability of soft computing methods and techniques to effectively address various hard-to-solve design tasks and issues. Both these tasks and issues are studied in the first part of the paper, accompanied by references to some results extracted from a survey performed in some industrial enterprises. The second part of the paper makes an extensive review of the literature regarding the application of soft computing (SC) techniques in engineering design. Although this review cannot be collectively exhaustive, it may be considered as a valuable guide for researchers who are interested in the domain of engineering design and wish to explore the opportunities offered by fuzzy logic, artificial neural networks and genetic algorithms for further improvement of both the design outcome and the design process itself. An arithmetic method is used in order to evaluate the review results, to locate the research areas where SC has already given considerable results and to reveal new research opportunities. (C) 2007 Elsevier Ltd. All rights reserved.
2008 * 379(<-274): Computational algorithms inspired by biological processes and evolution In recent times, computational algorithms inspired by biological processes and evolution have been gaining much popularity for solving science and engineering problems. These algorithms are broadly classified into evolutionary computation and swarm intelligence algorithms, which are derived based on the analogy of natural evolution and biological activities. These include genetic algorithms, genetic programming, differential evolution, particle swarm optimization, ant colony optimization, artificial neural networks, etc. The algorithms, being random-search techniques, use heuristics to guide the search towards the optimal solution and speed up convergence to obtain globally optimal solutions. The bio-inspired methods have several attractive features and advantages compared to conventional optimization solvers. They also offer the advantage of a simultaneous simulation and optimization environment for solving real-world problems that are hard to define in simple expressions. These biologically inspired methods have provided novel ways of problem-solving for practical problems in traffic routing, networking, games, industry, robotics, economics, mechanical, chemical, electrical, civil, water resources and other fields. This article discusses the key features and development of bio-inspired computational algorithms, and their scope for application in science and engineering fields. 2012 * 386(<-683): FUZZY THRESHOLD FUNCTIONS AND APPLICATIONS The set of fuzzy threshold functions is defined to be a fuzzy set over the set of functions. All threshold functions have full membership in this fuzzy set. The paper defines and investigates a distance measure between a non-linearly separable function and the set of all threshold functions.
It defines an explicit expression for the membership function of a fuzzy threshold function through the use of this distance measure and finds three upper bounds for this measure. It presents a general method to compute the distance, an algorithm to generate the representation automatically, and a procedure to determine the proper weights and thresholds automatically. It also presents the relationships among threshold gate networks, artificial neural networks and fuzzy neural networks. The results may have useful applications in logic design, pattern recognition, fuzzy logic, multi-objective fuzzy optimization and related areas. 1995 * 391(<-579): Nonessential objectives within network approaches for MCDM In Gal and Hanne [Eur. J. Oper. Res. 119 (1999) 373], the problem of using several methods to solve a multiple criteria decision making (MCDM) problem with linear objective functions after dropping nonessential objectives is analyzed. It turned out that the solution need not be the same when using various methods for solving the system containing the nonessential objectives or not. In this paper we consider the application of network approaches for multicriteria decision making, such as neural networks and an approach for combining MCDM methods (called MCDM networks). We discuss questions of comparing the results obtained with several methods as applied to the problem with or without nonessential objectives. In particular, we argue for considering redundancies such as nonessential objectives as a native feature in complex information processing. In contrast to previous results on nonessential objectives, the current paper focuses on discrete MCDM problems, which are also denoted as multiple attribute decision making (MADM) problems. (c) 2004 Elsevier B.V. All rights reserved.
2006 * 392(<-666): Clustering and selection of multiple criteria alternatives using unsupervised and supervised neural networks There are decision-making problems that involve grouping and selecting a set of alternatives. Traditional decision-making approaches treat different sets of alternatives with the same method of analysis and selection. In this paper, we propose clustering alternatives into different sets so that different methods of analysis, selection, and implementation can be applied to each set. We consider multiple criteria decision-making alternatives where the decision-maker is faced with several conflicting and non-commensurate objectives (or criteria). For example, consider buying a set of computers for a company that vary in terms of their functions, prices, and computing powers. In this paper, we develop theories and procedures for clustering and selecting discrete multiple criteria alternatives. The sets of alternatives clustered are mutually exclusive and are based on (1) similar features among alternatives, and (2) the preferential structure of the decision-maker. The decision-making process can be broken down into three steps: (1) generating alternatives; (2) grouping or clustering alternatives based on the similarity of their features; and (3) choosing one or more alternatives from each cluster of alternatives. We utilize unsupervised clustering artificial neural networks (ANNs) with variable weights for the clustering of alternatives, and we use feedforward ANNs for the selection of the best alternatives from each cluster of alternatives. The decision-maker is interactively involved by comparing and contrasting alternatives within each group so that the best alternative can be selected from each group. For the learning mechanism of the ANN, we propose using a generalized Euclidean distance whereby, by changing its coefficients, new formations of clusters of alternatives can be achieved.
The algorithm is interactive and the results are independent of the initial set-up information. Some examples and computational results are presented. 2000 * 395(<-215): A novel artificial immune clonal selection classification and rule mining with swarm learning model Metaheuristic optimisation algorithms have become a popular choice for solving complex problems. By integrating the Artificial Immune clonal selection algorithm (CSA) and the particle swarm optimisation (PSO) algorithm, a novel hybrid Clonal Selection Classification and Rule Mining with Swarm Learning Algorithm (CS2) is proposed. The main goal of the approach is to exploit and explore the parallel computation merit of Clonal Selection and the speed and self-organisation merits of Particle Swarm by sharing information between the clonal selection population and the particle swarm. Hence, we employed the advantages of PSO to improve the mutation mechanism of the artificial immune CSA and to mine classification rules within datasets. Consequently, our proposed algorithm required less training time and fewer memory cells in comparison to other AIS algorithms. In this paper, classification rule mining has been modelled as a multiobjective optimisation problem with predictive accuracy. The multiobjective approach is intended to allow the PSO algorithm to return an approximation to the accuracy and comprehensibility border, containing solutions that are spread across the border. We compared the classification accuracy of our proposed algorithm CS2 with five commonly used CSAs, namely: AIRS1, AIRS2, AIRS-Parallel, CLONALG, and CSCA using eight benchmark datasets. We also compared the classification accuracy of CS2 with five other methods, namely: Naive Bayes, SVM, MLP, CART, and RBF. The results show that the proposed algorithm is comparable to the 10 studied algorithms.
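The generalized Euclidean distance with variable weights used in the clustering abstract above can be sketched as follows (an illustrative reconstruction; the weight vector `w` and the example data are assumptions, not taken from the paper):

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Generalized Euclidean distance with per-criterion weights w;
    changing the coefficients in w changes which alternatives end up
    close together, and hence how clusters form."""
    x, y, w = map(np.asarray, (x, y, w))
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))

# Two alternatives rated on two criteria (e.g. price, computing power).
a, b = [1.0, 5.0], [2.0, 1.0]
print(weighted_euclidean(a, b, [1.0, 1.0]))  # plain Euclidean distance
print(weighted_euclidean(a, b, [1.0, 0.0]))  # 2nd criterion ignored -> 1.0
```

Setting a weight to zero removes that criterion from the comparison entirely, which is the mechanism by which varying the coefficients yields different cluster formations.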
As a result, the hybridisation of CSA and PSO allows each algorithm to develop its respective merits, compensate for the other's defects, and improve both the quality and speed of the search. 2013 * 433(<-670): Multiobjective genetic optimization of diagnostic classifiers with implications for generating receiver operating characteristic curves It is well understood that binary classifiers have two implicit objective functions (sensitivity and specificity) describing their performance. Traditional methods of classifier training attempt to combine these two objective functions (or two analogous class performance measures) into one so that conventional scalar optimization techniques can be utilized. This involves incorporating a priori information into the aggregation method so that the resulting performance of the classifier is satisfactory for the task at hand. We have investigated the use of a niched Pareto multiobjective genetic algorithm (GA) for classifier optimization. With niched Pareto GAs, an objective vector is optimized instead of a scalar function, eliminating the need to aggregate classification objective functions. The niched Pareto GA returns a set of optimal solutions that are equivalent in the absence of any information regarding the preferences of the objectives. The a priori knowledge that was used for aggregating the objective functions in conventional classifier training can instead be applied post-optimization to select one of the series of solutions returned from the multiobjective genetic optimization. We have applied this technique to train a linear classifier and an artificial neural network (ANN), using simulated datasets. The performances of the solutions returned from the multiobjective genetic optimization represent a series of optimal (sensitivity, specificity) pairs, which can be thought of as operating points on a receiver operating characteristic (ROC) curve.
All possible ROC curves for a given dataset and classifier are less than or equal to the ROC curve generated by the niched Pareto genetic optimization. 1999 * 438(<-226): Algorithm for Increasing the Speed of Evolutionary Optimization and its Accuracy in Multi-objective Problems Optimization algorithms are important tools for the solution of combinatorial management problems. Nowadays, many of those problems are addressed by using evolutionary algorithms (EAs) that move toward a near-optimal solution by repetitive simulations. Sometimes, such extensive simulations are not possible or are costly and time-consuming. Thus, in this study a method based on artificial neural networks (ANN) is proposed to reduce the number of simulations required in EAs. Specifically, an ANN simulator is used to reduce the number of simulations by the main simulator. The ANN is trained and updated only for required areas in the decision space. Performance of the proposed method is examined by integrating it with the non-dominated sorting genetic algorithm (NSGAII) in multi-objective problems. In terms of density and optimality of the Pareto front, the hybrid NSGAII-ANN is able to extract the Pareto front with much less simulation time compared to the sole use of the NSGAII algorithm. The proposed NSGAII-ANN methodology was examined using three standard test problems (FON, KUR, and ZDT1) and one real-world problem. The latter addresses the operation of a reservoir with two objectives (meeting demand and flood control). Thus, based on this study, use of the NSGAII-ANN integrative algorithm in problems with time-consuming simulators reduces the required time for optimization by up to 50 times.
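The central idea of the niched Pareto GA abstract above, that the non-dominated (sensitivity, specificity) pairs form the operating points of an ROC curve, can be illustrated with a minimal dominance filter (a sketch of Pareto filtering in general, not the authors' niched GA; the example classifier scores are invented):

```python
def pareto_front(points):
    """Return the non-dominated (sensitivity, specificity) pairs: a point
    is discarded if some other point is at least as good in both
    objectives and different from it (points assumed distinct)."""
    front = []
    for p in points:
        dominated = any(q[0] >= p[0] and q[1] >= p[1] and q != p
                        for q in points)
        if not dominated:
            front.append(p)
    return sorted(front)

# Candidate classifiers as (sensitivity, specificity) pairs.
classifiers = [(0.9, 0.5), (0.8, 0.7), (0.8, 0.6), (0.6, 0.9), (0.5, 0.5)]
print(pareto_front(classifiers))  # -> [(0.6, 0.9), (0.8, 0.7), (0.9, 0.5)]
```

Sorting the surviving pairs by sensitivity traces out exactly the monotone trade-off staircase that an ROC curve plots, which is why the Pareto set and the ROC operating points coincide.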
2013 * 476(<-647): Software verification of redundancy in neuro-evolutionary robotics Evolutionary methods are now commonly used to automatically generate autonomous controllers for physical robots as well as for virtually embodied organisms. Although it is generally accepted that some amount of redundancy may result from using an evolutionary approach, few studies have focused on empirically testing the actual amount of redundancy that is present in controllers generated using artificial evolutionary systems. Network redundancy in certain application domains, such as defence, space, and safeguarding, is unacceptable as it puts the reliability of the system at great risk. Thus, our aim in this paper is to test and compare the redundancies of artificial neural network (ANN) controllers that are evolved for a quadrupedal robot using four different evolutionary methodologies. Our results showed that the least amount of redundancy was generated using a self-adaptive Pareto evolutionary multi-objective optimization (EMO) algorithm compared to the more commonly used single-objective evolutionary algorithm (EA) and the weighted-sum EMO algorithm. Finally, self-adaptation was found to be highly beneficial in reducing redundancy when compared against a hand-tuned Pareto EMO algorithm. 2003 * 502(<-235): COMBINING EVOLUTION STRATEGY WITH ORDINAL OPTIMIZATION In this paper, we combine evolution strategy (ES) with ordinal optimization (OO), abbreviated as ES+OO, to solve real-time combinatorial stochastic simulation optimization problems with a huge discrete solution space. The first step of ES+OO is to use an artificial neural network (ANN) to construct a surrogate model to roughly evaluate the objective value of a solution. In the second step, we apply ES assisted by the ANN-based surrogate model to the considered problem to obtain a subset of good enough solutions.
In the last step, we use the exact model to evaluate each solution in the good enough subset, and the best one is the final good enough solution. We apply the proposed algorithm to a wafer testing problem, which is formulated as a combinatorial stochastic simulation optimization problem that consists of a huge discrete solution space formed by the vector of threshold values in the testing process. We demonstrate (a) that ES+OO outperforms the combination of genetic algorithm (GA) with OO using extensive simulations in the wafer testing problem, and that its computational efficiency is suitable for real-time application, (b) the merit of using the OO approach in solving the considered problem, and (c) that ES+OO can obtain an approximate Pareto optimal solution of the multi-objective function residing in the considered problem. Above all, we propose a systematic procedure to evaluate the performance of ES+OO by providing a quantitative result. 2013 * 505(<-678): Artificial neural network representations for hierarchical preference structures In this paper, we introduce two artificial neural network formulations that can be used to assess the preference ratings from the pairwise comparison matrices of the Analytic Hierarchy Process. First, we introduce a modified Hopfield network that can determine the vector of preference ratings associated with a positive reciprocal comparison matrix. The dynamics of this network are mathematically equivalent to the power method, a widely used numerical method for computing the principal eigenvectors of square matrices. However, this Hopfield network representation is incapable of generalizing the preference patterns, and consequently is not suitable for approximating the preference ratings if the pairwise comparison judgments are imprecise. Second, we present a feed-forward neural network formulation that does have the ability to accurately approximate the preference ratings.
We use a simulation experiment to verify the robustness of the feed-forward neural network formulation with respect to imprecise pairwise judgments. From the results of this experiment, we conclude that the feed-forward neural network formulation appears to be a powerful tool for analyzing discrete alternative multicriteria decision problems with imprecise or fuzzy ratio-scale preference judgments. Copyright (C) 1996 Elsevier Science Ltd 1996 * 519(<-366): Multi-objective memetic algorithm: comparing artificial neural networks and pattern search filter method approaches In this work, two methodologies for reducing the computation time of expensive multi-objective optimization problems are compared. These methodologies consist of the hybridization of a multi-objective evolutionary algorithm (MOEA) with local search procedures. First, an inverse artificial neural network proposed previously, consisting of mapping the decision variables into the multiple objectives to be optimized in order to generate improved solutions in certain generations of the MOEA, is presented. Second, a new approach based on a pattern search filter method is proposed in order to perform a local search around certain solutions selected previously from the Pareto frontier. The results obtained by applying both methodologies to difficult test problems indicate a good performance of the proposed approaches. 2011 * 527(<-270): Structure optimization of neural network for dynamic system modeling using multi-objective genetic algorithm The problem of constructing an adequate and parsimonious neural network topology for modeling a non-linear dynamic system is studied and investigated. Neural networks have been shown to perform function approximation and to represent dynamic systems. The network structures are usually guessed at or selected in accordance with the designer's prior knowledge. However, the multiplicity of the model parameters makes it troublesome to find an optimum structure.
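The power method, which the abstract above states is mathematically equivalent to the modified Hopfield network's dynamics, can be sketched directly for a pairwise comparison matrix (an illustrative reconstruction; the iteration count and the example matrix are assumptions):

```python
import numpy as np

def preference_ratings(A, iters=100):
    """Power method on a positive reciprocal comparison matrix A:
    repeated multiplication converges to the principal eigenvector,
    which, normalized to sum to 1, gives the AHP preference ratings."""
    w = np.ones(A.shape[0])
    for _ in range(iters):
        w = A @ w
        w = w / w.sum()  # normalize so the ratings sum to 1
    return w

# Pairwise judgments: alternative 1 is 2x alt 2 and 4x alt 3 (consistent).
A = np.array([[1.0,  2.0, 4.0],
              [0.5,  1.0, 2.0],
              [0.25, 0.5, 1.0]])
print(preference_ratings(A).round(3))  # -> [0.571 0.286 0.143]
```

For a perfectly consistent matrix like this one the method converges in a single step; for imprecise judgments it still converges to the principal eigenvector, which is the rating vector the AHP prescribes.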
In this paper, an alternative algorithm based on a multi-objective optimization algorithm is proposed. The developed neural network model should fulfil two criteria or objectives, namely good predictive accuracy and minimum model structure. The results show that the proposed algorithm is able to identify simulated examples correctly, and identifies the adequate model for real process data based on a set of solutions called the Pareto optimal set, from which the best network can be selected. 2012 * 645(<-524): Multi-criteria optimization in nonlinear predictive control The multi-criteria predictive control of nonlinear dynamical systems based on Artificial Neural Networks (ANNs) and genetic algorithms (GAs) is considered. The ANNs are used to determine process models at each operating level; the control action is provided by minimizing a set of control objectives which are functions of the future predicted output and the future control actions, taking into account constraints on the input signal. An aggregative method based on the Non-dominated Sorting Genetic Algorithm (NSGA) is applied to solve the multi-criteria optimization problem. The results obtained with the proposed control scheme are compared in simulation to those obtained with the multi-model control approach. (c) 2007 IMACS. Published by Elsevier B.V. All rights reserved. 2008 * 653(<-664): An artificial neural network approach to multicriteria model selection This paper presents an intelligent decision support system based on neural network technology for multicriteria model selection. The paper categorizes the problem as simple, utility/value, interactive, or outranking type according to six basic features. The classification of the problem is realized through a two-step neural network analysis applying the back-propagation algorithm. The first Artificial Neural Network (ANN) model, which is used for the selection of an appropriate solving method cluster, consists of one hidden layer.
The six input neurons of the model represent the MCDM problem features, while the two output neurons represent the four MCDM categories. The second ANN model is used for the selection of a specific method within the selected cluster. 2001
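The final abstract's scheme of representing four MCDM categories on two output neurons suggests a 2-bit binary code; a minimal decoding sketch follows (the exact coding is an assumption on our part, since the paper does not specify it):

```python
# Hypothetical 2-bit coding of the four MCDM categories on two
# output neurons; the ordering below is an assumption.
CATEGORIES = ["simple", "utility/value", "interactive", "outranking"]

def decode(outputs, threshold=0.5):
    """Map the two output-neuron activations to one of four categories
    by thresholding each activation to a bit."""
    bits = [1 if o > threshold else 0 for o in outputs]
    return CATEGORIES[bits[0] * 2 + bits[1]]

print(decode([0.1, 0.9]))  # -> "utility/value"
print(decode([0.8, 0.7]))  # -> "outranking"
```

Two neurons suffice for four classes precisely because each thresholded activation contributes one bit, so the network needs fewer outputs than a one-neuron-per-category encoding.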