-*- mode: org -*-

* 3(<-596): MOP/GP models for machine learning
Techniques for machine learning have been extensively studied in recent years as effective tools in data mining. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) has recently gained much popularity. In pattern classification problems with two class sets, its idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. This task is performed by solving a quadratic programming problem in a traditional formulation, and can be reduced to solving a linear programming problem in another formulation. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper presents an overview of how effectively MOP/GP techniques can be applied to machine learning methods such as SVM, and discusses their problems. (c) 2004 Elsevier B.V. All rights reserved. 2005

* 4(<-614): Study on Support Vector Machines Using Mathematical Programming
Machine learning has been extensively studied in recent years as an effective tool in pattern classification problems. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) has recently gained much popularity.
In pattern classification problems with two class sets, the idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper proposes a new family of SVMs using MOP/GP techniques, and discusses its effectiveness through several numerical experiments. 2005

* 6(<-465): A Multiobjective Genetic SVM Approach for Classification Problems With Limited Training Samples
In this paper, a novel method for semisupervised classification with limited training samples is presented. Its aim is to exploit unlabeled data, available at zero cost in the image under analysis, for improving the accuracy of a classification process based on support vector machines (SVMs). It is based on the idea of augmenting the original set of training samples with a set of unlabeled samples after estimating their label. The label estimation process is performed within a multiobjective genetic optimization framework where each chromosome of the evolving population encodes the label estimates as well as the SVM classifier parameters for tackling the model selection issue. Such a process is guided by the joint minimization of two different criteria which express the generalization capability of the SVM classifier. The two explored criteria are an empirical risk measure and an indicator of the classification model sparseness, respectively. The experimental results obtained on two multisource remote sensing data sets confirm the promising capabilities of the proposed approach, which allows the following: 1) taking a clear advantage in terms of classification accuracy from unlabeled samples used for inflating the original training set and 2) solving automatically the tricky model selection issue.
2009

* 8(<-514): Genetic SVM approach to semisupervised multitemporal classification
The updating of classification maps, as new image acquisitions are obtained, raises the problem of ground-truth information (training samples) updating. In this context, semisupervised multitemporal classification represents an interesting though still not well consolidated approach to tackle this issue. In this letter, we propose a novel methodological solution based on this approach. Its underlying idea is to update the ground-truth information through an automatic estimation process, which exploits archived ground-truth information as well as basic indications from the user about allowed/forbidden class transitions from one acquisition date to another. This updating problem is formulated by means of the support vector machine classification approach and a constrained multiobjective optimization genetic algorithm. Experimental results on a multitemporal data set consisting of two multisensor (Landsat-5 Thematic Mapper and European Remote Sensing satellite synthetic aperture radar) images are reported and discussed. 2008

* 13(<- 90): A hybrid meta-learning architecture for multi-objective optimization of SVM parameters
Support Vector Machines (SVMs) have attracted considerable attention due to their theoretical foundations and good empirical performance when compared to other learning algorithms in different applications. However, SVM performance strongly depends on the adequate calibration of its parameters. In this work we propose a hybrid multi-objective architecture which combines meta-learning (ML) with multi-objective particle swarm optimization algorithms for the SVM parameter selection problem. Given an input problem, the proposed architecture uses an ML technique to suggest an initial Pareto front of SVM configurations based on previous similar learning problems; the suggested Pareto front is then refined by a multi-objective optimization algorithm.
In this combination, solutions provided by ML are likely to be located in good regions of the search space. Hence, using a reduced number of successful candidates, the search process converges faster and is less expensive. In the performed experiments, the proposed solution was compared to traditional multi-objective algorithms with random initialization, obtaining Pareto fronts with higher quality on a set of 100 classification problems. (C) 2014 Elsevier B.V. All rights reserved. 2014

* 19(<-585): Multiobjective analysis of chaotic dynamic systems with sparse learning machines
Sparse learning machines provide a viable framework for modeling chaotic time-series systems. A powerful state-space reconstruction methodology using both support vector machines (SVM) and relevance vector machines (RVM) within a multiobjective optimization framework is presented in this paper. The utility and practicality of the proposed approaches have been demonstrated on the time series of the Great Salt Lake (GSL) biweekly volumes from 1848 to 2004. A comparison of the two methods is made based on their predictive power and robustness. The reconstruction of the dynamics of the Great Salt Lake volume time series is attained using the most relevant feature subset of the training data. In this paper, efforts are also made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure, model parameters, and bootstrapping samples. The resulting model will normally have a structure, including parameterization, that suits the information content of the available data, and can be used to develop time series forecasts for multiple lead times ranging from two weeks to several months. (c) 2005 Elsevier Ltd. All rights reserved.
2006

* 20(<-157): Leave-one-out cross-validation-based model selection for multi-input multi-output support vector machine
As an effective approach for multi-input multi-output regression estimation problems, a multi-dimensional support vector regression (SVR), named M-SVR, is generally capable of obtaining better predictions than applying a conventional support vector machine (SVM) independently for each output dimension. However, although there are many generalization error bounds for conventional SVMs, none of them can be directly applied to M-SVR. In this paper, a new leave-one-out (LOO) error estimate for M-SVR is first derived through a virtual LOO cross-validation procedure. This LOO error estimate can be calculated directly once the training process ends, with less computational complexity than the traditional LOO method. Based on this LOO estimate, a new model selection method for M-SVR based on a multi-objective optimization strategy is then proposed. Experiments on a toy noisy function regression and a practical engineering data set (dynamic load identification on a cylinder vibration system) are both conducted, demonstrating the comparable results of the proposed method in terms of generalization performance and computational cost. 2014

* 23(<-609): Multi-objective model selection for support vector machines
In this article, model selection for support vector machines is viewed as a multi-objective optimization problem, where model complexity and training accuracy define two conflicting objectives. Different optimization criteria are evaluated: split modified radius margin bounds, which allow for comparing existing model selection criteria, and the training error in conjunction with the number of support vectors for designing sparse solutions. 2005

* 26(<-150): A novel feature selection method for twin support vector machine
Both the support vector machine (SVM) and the twin support vector machine (TWSVM) are powerful classification tools.
However, in contrast to the many SVM-based feature selection methods, TWSVM so far has no corresponding method, due to its different mechanism. In this paper, we propose a feature selection method based on TWSVM, called FTSVM. It is interesting because of the advantages of TWSVM in many cases. Our FTSVM is quite different from the SVM-based feature selection methods. In fact, a linear SVM constructs a single separating hyperplane which corresponds to a single weight for each feature, whereas a linear TWSVM constructs two fitting hyperplanes which correspond to two weights for each feature. In our linear FTSVM, a feature selection matrix is introduced in order to link these two fitting hyperplanes. Thus, feature selection becomes the problem of finding an optimal matrix, which leads to solving a multi-objective mixed-integer programming problem by a greedy algorithm. In addition, the linear FTSVM has been extended to the nonlinear case. Furthermore, a feature ranking strategy based on FTSVM is also suggested. The experimental results on several publicly available benchmark datasets indicate that our FTSVM not only gives good feature selection in both linear and nonlinear cases but also improves the performance of TWSVM efficiently. (C) 2014 Elsevier B.V. All rights reserved. 2014

* 30(<- 8): Novel approaches using evolutionary computation for sparse least square support vector machines
This paper introduces two new approaches to building sparse least square support vector machines (LSSVM) based on genetic algorithms (GAs) for classification tasks. LSSVM classifiers are an alternative to SVM ones because the training process of LSSVM classifiers only requires solving a linear equation system instead of a quadratic programming optimization problem. However, the absence of sparseness in the Lagrange multiplier vector (i.e. the solution) is a significant problem for the effective use of these classifiers.
In order to overcome this lack of sparseness, we propose both single- and multi-objective GA approaches to leave a few support vectors out of the solution without affecting the classifier's accuracy, and even improving it. The main idea is to leave out outliers, non-relevant patterns, or patterns which may be corrupted by noise and would thus prevent classifiers from achieving higher accuracies along with a reduced set of support vectors. Differently from previous works, genetic algorithms are used in this work to obtain sparseness, not to find the optimal values of the LSSVM hyper-parameters. (C) 2015 Elsevier B.V. All rights reserved. 2015

* 31(<-586): Additive preference model with piecewise linear components resulting from Dominance-based Rough Set Approximations
Dominance-based Rough Set Approach (DRSA) has been proposed for multi-criteria classification problems in order to handle inconsistencies in the input information with respect to the dominance principle. The end result of DRSA is a decision rule model of Decision Maker preferences. In this paper, we consider an additive function model resulting from dominance-based rough approximations. The presented approach is similar to the UTA and UTADIS methods. However, we define the goal function of the optimization problem in a similar way as is done in Support Vector Machines (SVM). The problem may also be defined as one of searching for linear value functions in a transformed feature space obtained by exhaustive binarization of criteria. 2006

* 32(<-120): A niching genetic programming-based multi-objective algorithm for hybrid data classification
This paper introduces a multi-objective algorithm based on genetic programming to extract classification rules in databases composed of hybrid data, i.e., regular (e.g. numerical, logical, and textual) and non-regular (e.g. geographical) attributes.
This algorithm employs a niching technique combined with a population archive in order to identify the rules that are more suitable for classifying items amongst classes of a given data set. The algorithm is implemented in such a way that the user can choose the function set that is more adequate for a given application. This feature makes the proposed approach virtually applicable to any kind of data set classification problem. Besides, the classification problem is modeled as a multi-objective one, in which the maximization of the accuracy and the minimization of the classifier complexity are considered as the objective functions. A set of different classification problems, with considerably different data sets and domains, has been considered: wines, patients with hepatitis, incipient faults in power transformers and level of development of cities. In this last data set, some of the attributes are geographical, and they are expressed as points, lines or polygons. The effectiveness of the algorithm has been compared with three other methods widely employed for classification: Decision Tree (C4.5), Support Vector Machine (SVM) and Radial Basis Function (RBF). Statistical comparisons have been conducted employing one-way ANOVA and Tukey's tests, in order to provide reliable comparison of the methods. The results show that the proposed algorithm achieved better classification effectiveness in all tested instances, which suggests that it is suitable for a considerable range of classification applications. (C) 2014 Elsevier B.V. All rights reserved. 2014

* 37(<- 64): Surrogate-assisted multi-objective model selection for support vector machines
Classification is one of the most well-known tasks in supervised learning. A vast number of algorithms for pattern classification have been proposed so far.
Among these, support vector machines (SVMs) are one of the most popular approaches, due to the high performance reached by these methods in a wide number of pattern recognition applications. Nevertheless, the effectiveness of SVMs highly depends on their hyper-parameters. Besides the fine-tuning of the hyper-parameters, the way in which the features are scaled, as well as the presence of non-relevant features, could affect their generalization performance. This paper introduces an approach for addressing model selection for support vector machines used in classification tasks. In our formulation, a model can be composed of feature selection and pre-processing methods besides the SVM classifier. We formulate the model selection problem as a multi-objective one, aiming to simultaneously minimize two components that are closely related to the error of a model: the bias and variance components, which are estimated in an experimental fashion. A surrogate-assisted evolutionary multi-objective optimization approach is adopted to explore the hyper-parameter space. We adopted this approach because estimating the bias and variance can be computationally expensive. Therefore, by using surrogate-assisted optimization, we expect to reduce the number of solutions evaluated by the fitness functions, so that the computational cost is also reduced. Experimental results conducted on benchmark datasets widely used in the literature indicate that our proposal obtains highly competitive models with fewer fitness function evaluations, when compared to state-of-the-art model selection methods. (C) 2014 Elsevier B.V. All rights reserved.
2015

* 40(<-467): AG-ART: An adaptive approach to evolving ART architectures
This paper focuses on classification problems, and in particular on the evolution of ARTMAP architectures using genetic algorithms, with the objective of improving generalization performance and alleviating the adaptive resonance theory (ART) category proliferation problem. In a previous effort, we introduced evolutionary fuzzy ARTMAP (FAM), referred to as genetic Fuzzy ARTMAP (GFAM). In this paper we apply an improved genetic algorithm to FAM and extend these ideas to two other ART architectures: ellipsoidal ARTMAP (EAM) and Gaussian ARTMAP (GAM). One of the major advantages of the proposed improved genetic algorithm is that it adapts the GA parameters automatically, in a way that takes into consideration the intricacies of the classification problem under consideration. The resulting genetically engineered ART architectures are justifiably referred to as AG-FAM, AG-EAM and AG-GAM, or collectively as AG-ART (adaptive genetically engineered ART). We compare the performance (in terms of accuracy, size, and computational cost) of the AG-ART architectures with GFAM and other ART architectures that have appeared in the literature and attempted to solve the category proliferation problem. Our results demonstrate that AG-ART architectures exhibit better performance than their other ART counterparts (semi-supervised ART) and better performance than GFAM. We also compare AG-ART's performance to other related results published in the classification literature, and demonstrate that AG-ART architectures exhibit competitive generalization performance and, quite often, produce smaller-size classifiers in solving the same classification problems. We also show that AG-ART's performance gains are achieved within a reasonable computational budget. (C) 2008 Elsevier B.V. All rights reserved.
2009

* 41(<-580): Multi-objective parameters selection for SVM classification using NSGA-II
Selecting proper parameters is an important issue in extending the classification ability of the Support Vector Machine (SVM), which makes SVM practically useful. The Genetic Algorithm (GA) has been widely applied to the problem of parameter selection for SVM classification due to its ability to discover good solutions quickly for complex searching and optimization problems. However, traditional GAs in this field rely on a single generalization error bound as the fitness function for selecting parameters. Since several generalization error bounds have been developed, picking and using a single criterion as the fitness function seems insufficient. Motivated by multi-objective optimization, this paper introduces an efficient method of parameter selection for SVM classification based on the multi-objective evolutionary algorithm NSGA-II. We also introduce an adaptive mutation rate for NSGA-II. Experimental results show that our method is better than single-objective approaches, especially in the case of tiny training sets with large testing sets. 2006

* 42(<-589): Multiobjective optimization of ensembles of multilayer perceptrons for pattern classification
Pattern classification seeks to minimize the error on unknown patterns. However, in many real-world applications, type I (false positive) and type II (false negative) errors have to be dealt with separately, which is a complex problem since an attempt to minimize one of them usually makes the other grow. In fact, one type of error can be more important than the other, and a trade-off that minimizes the most important error type must be reached. Despite the importance of type II errors, most pattern classification methods take into account only the global classification error.
In this paper we propose to optimize both error types in classification by means of a multiobjective algorithm in which each error type and the network size is an objective of the fitness function. A modified version of the GProp method (optimization and design of multilayer perceptrons) is used to simultaneously optimize the network size and the type I and type II errors. 2006

* 43(<-238): Multiplicative Update Rules for Concurrent Nonnegative Matrix Factorization and Maximum Margin Classification
The state-of-the-art classification methods which use nonnegative matrix factorization (NMF) employ two consecutive independent steps. The first one performs data transformation (dimensionality reduction) and the second one classifies the transformed data using classification methods such as nearest neighbor/centroid or support vector machines (SVMs). In the following, we focus on using NMF factorization followed by SVM classification. Typically, the parameters of these two steps, e.g., the NMF bases/coefficients and the support vectors, are optimized independently, thus leading to suboptimal classification performance. In this paper, we merge these two steps into one by incorporating maximum margin classification constraints into the standard NMF optimization. The notion behind the proposed framework is to perform NMF while ensuring that the margin between the projected data of the two classes is maximal. The concurrent NMF factorization and support vector optimization are performed through a set of multiplicative update rules. In the same context, the maximum margin classification constraints are imposed on the NMF problem with additional discriminant constraints, and the respective multiplicative update rules are extracted. The impact of the maximum margin classification constraints on the NMF factorization problem is addressed in Section VI.
Experimental results in several databases indicate that the incorporation of the maximum margin classification constraints into the NMF and discriminant NMF objective functions improves the accuracy of the classification. 2013

* 44(<-425): A multi-model selection framework for unknown and/or evolutive misclassification cost problems
In this paper, we tackle the problem of model selection when misclassification costs are unknown and/or may evolve. Unlike traditional approaches based on a scalar optimization, we propose a generic multi-model selection framework based on a multi-objective approach. The idea is to automatically train a pool of classifiers instead of one single classifier, each classifier in the pool optimizing a particular trade-off between the objectives. Within the context of two-class classification problems, we introduce the "ROC front concept" as an alternative to the ROC curve representation. This strategy is applied to the multi-model selection of SVM classifiers using an evolutionary multi-objective optimization algorithm. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets as well as on a real-world classification problem. (C) 2009 Elsevier Ltd. All rights reserved. 2010

* 45(<-429): NONCOST SENSITIVE SVM TRAINING USING MULTIPLE MODEL SELECTION
In this paper, we propose a multi-objective optimization framework for SVM hyperparameter tuning. The key idea is to manage a population of classifiers optimizing both False Positive and True Positive rates, rather than a single classifier optimizing a scalar criterion. Hence, each classifier in the population optimizes a particular trade-off between the objectives. Within the context of two-class classification problems, our work introduces the "receiver operating characteristics (ROC) front concept", depicting a population of SVM classifiers, as an alternative to the ROC curve representation.
The proposed framework leads to noncost-sensitive SVM training relying on the pool of classifiers. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets. 2010

* 46(<-567): Two-group classification via a biobjective margin maximization model
In this paper we propose a biobjective model for two-group classification via margin maximization, in which the margins in both classes are simultaneously maximized. The set of Pareto-optimal solutions is described, yielding a set of parallel hyperplanes, one of which is just the solution of the classical SVM approach. In order to take into account different misclassification costs or a priori probabilities, the ROC curve can be used to select one of these hyperplanes by expressing the adequate tradeoff between sensitivity and specificity. Our result gives a theoretical motivation for using the ROC approach in case the misclassification costs in the two groups are not necessarily equal. (c) 2005 Elsevier B.V. All rights reserved. 2006

* 47(<-573): Multi-class ROC analysis from a multi-objective optimisation perspective
The receiver operating characteristic (ROC) has become a standard tool for the analysis and comparison of classifiers when the costs of misclassification are unknown. There has been relatively little work, however, examining ROC for more than two classes. Here we discuss and present an extension of the standard two-class ROC to multi-class problems. We define the ROC surface for the Q-class problem in terms of a multi-objective optimisation problem in which the goal is to simultaneously minimise the Q(Q-1) misclassification rates, when the misclassification costs and the parameters governing the classifier's behaviour are unknown. We present an evolutionary algorithm to locate the Pareto front, the optimal trade-off surface between misclassifications of different types.
The use of the Pareto-optimal surface to compare classifiers is discussed, and we present a straightforward multi-class analogue of the Gini coefficient. The performance of the evolutionary algorithm is illustrated on a synthetic three-class problem, for both k-nearest neighbour and multi-layer perceptron classifiers. (c) 2005 Elsevier B.V. All rights reserved. 2006

* 48(<- 26): Joint model for feature selection and parameter optimization coupled with classifier ensemble in chemical mention recognition
Mention recognition in chemical texts plays an important role in a wide range of application areas. Feature selection and parameter optimization are two important issues in machine learning. While the former improves the quality of a classifier by removing redundant and irrelevant features, the latter concerns finding the most suitable parameter values, which have a significant impact on the overall classification performance. In this paper we formulate a joint model that performs feature selection and parameter optimization simultaneously, and propose two approaches based on the concepts of single- and multiobjective optimization techniques. Classifier ensemble techniques are also employed to improve the performance further. We identify and implement a variety of features that are mostly domain-independent. Experiments are performed with various configurations on the benchmark patent and Medline datasets. Evaluation shows encouraging performance in all the settings. (C) 2015 Elsevier B.V. All rights reserved. 2015

* 53(<- 44): The influence of scaling metabolomics data on model classification accuracy
Correctly measured classification accuracy is important not only for properly classifying pre-designated classes such as disease versus control, but also for ensuring that the biological question can be answered competently.
We recognised that there has been minimal investigation of pre-treatment methods and their influence on classification accuracy within the metabolomics literature. The standard approach to pre-treatment prior to classification modelling often incorporates the use of methods such as autoscaling, which positions all variables on a comparable scale, thus allowing one to achieve separation of two or more groups (target classes). This is often undertaken without any prior investigation into the influence of the pre-treatment method on the data and the supervised learning techniques employed. Whilst this is useful for deriving essential information such as predictive ability or visual interpretation in many cases, as shown in this study the standard approach is not always the most suitable option available. Here, a study has been conducted to investigate the influence of six pre-treatment methods (autoscaling, range, level, Pareto and vast scaling, as well as no scaling) on four classification models: principal components-discriminant function analysis (PC-DFA), support vector machines (SVM), random forests (RF) and k-nearest neighbours (kNN), using three publicly available metabolomics data sets. We have demonstrated that undertaking different pre-treatment methods can greatly affect the interpretation of the statistical modelling outputs. The results have shown that data pre-treatment is context dependent and that there was no single superior method for all the data sets used. Whilst we did find that vast scaling produced the most robust models in terms of classification rate for PC-DFA of both NMR spectroscopy data sets, in general we conclude that both vast scaling and autoscaling produced similar and superior results in comparison to the other four pre-treatment methods on both the NMR and GC-MS data sets.
It is therefore our recommendation that vast scaling be the primary pre-treatment method, as it appears to be more stable and robust across all the different classifiers applied in this study. 2015

* 54(<-127): SVM classification for imbalanced data sets using a multiobjective optimization framework
Classification of imbalanced data sets, in which negative instances outnumber the positive instances, is a significant challenge. These data sets are commonly encountered in real-life problems. However, the performance of well-known classifiers is limited in such cases. Support Vector Machines (SVMs), which have a solid theoretical background, also encounter a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L-1-norm SVM approach that is based on a three-objective optimization problem, so as to incorporate into the formulation the error sums for the two classes independently. Motivated by the inherent multi-objective nature of SVMs, the solution approach utilizes a reduction into two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements with varying degrees of increased computational effort. 2014

* 66(<-498): Accurate and resource-aware classification based on measurement data
In this paper, we face the problem of designing accurate decision-making modules in measurement systems that need to be implemented on resource-constrained platforms. We propose a methodology based on multiobjective optimization and genetic algorithms (GAs) for the analysis of support vector machine (SVM) solutions in the classification error-complexity space.
Specific criteria for the choice of optimal SVM classifiers and experimental results on both real and synthetic data are also discussed. 2008

* 69(<-369): Classification as Clustering: A Pareto Cooperative-Competitive GP Approach
Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.
2011 * 71(<-411): Integrating Clustering and Supervised Learning for Categorical Data Analysis The problem of fuzzy clustering of categorical data, where no natural ordering among the elements of a categorical attribute domain can be found, is an important problem in exploratory data analysis. As a result, a few clustering algorithms with a focus on categorical data have been proposed. In this paper, a modified differential evolution (DE)-based fuzzy c-medoids (FCMdd) clustering of categorical data has been proposed. The algorithm combines both local as well as global information with adaptive weighting. The performance of the proposed method has been compared with those using a genetic algorithm, simulated annealing, and the classical DE technique, besides the FCMdd, fuzzy k-modes, and average linkage hierarchical clustering algorithms, for four artificial and four real-life categorical data sets. Statistical tests have been carried out to establish the statistical significance of the proposed method. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the data points selected from different clusters based on their proximity to the respective medoids is used for training the SVM. The clustering assignments of the remaining points are thereafter determined using the trained classifier. The superiority of the integrated clustering and supervised learning approach has been demonstrated. 2010 * 77(<- 79): Pareto-Path Multitask Multiple Kernel Learning A traditional and intuitively appealing Multitask Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with (partially) shared kernel function, which allows information sharing among the tasks.
We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a multiobjective optimization problem, which considers the concurrent optimization of all task objectives involved in the Multitask Learning (MTL) problem. Motivated by this last observation and arguing that the former approach is heuristic, we propose a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives. We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance when compared with other similar MTL approaches. 2015 * 82(<-330): Integrating multicriteria PROMETHEE II method into a single-layer perceptron for two-class pattern classification PROMETHEE methods based on the outranking relation theory are extensively used in multicriteria decision aid. A preference index representing the intensity of preference for one pattern over another pattern can be measured by various preference functions. The higher the intensity, the stronger the preference is indicated. In contrast to traditional single-layer perceptrons (SLPs) with the sigmoid function, this paper develops a novel PROMETHEE II-based SLP using concepts from the PROMETHEE II method involving pairwise comparisons between patterns. The assignment of a class label to a pattern is dependent on its net preference index, which the proposed perceptron obtains. Specifically, this study designs a genetic-algorithm-based learning algorithm to determine the relative weights of respective criteria in order to derive the preference index for any pair of patterns.
Computer simulations involving several real-world data sets reveal the classification performance of the proposed PROMETHEE II-based SLP. The proposed perceptron performs well compared to the other well-known fuzzy or non-fuzzy classification methods. 2011 * 84(<-399): A single-layer perceptron with PROMETHEE methods using novel preference indices The Preference Ranking Organization METHods for Enrichment Evaluations (PROMETHEE) methods, based on the outranking relation theory, are used extensively in multi-criteria decision aid (MCDA). In particular, preference indices with weighted average aggregation representing the intensity of preference for one pattern over another pattern are measured by various preference functions. The higher the intensity, the stronger the preference is indicated. For MCDA, to obtain the ranking of alternatives, compromise operators such as the weighted average aggregation, or the disjunctive operators are often employed to aggregate the performance values of criteria. The compromise operators express the group utility or the majority rule, whereas the disjunctive operators take into account the strongly opponent or agreeable minorities. Since these two types of operators have their own unique features, it is interesting to develop a novel aggregator by integrating them into a single aggregator for a preference index. This study aims to develop a novel PROMETHEE-based single-layer perceptron (PROSLP) for pattern classification using the proposed preference index. The assignment of a class label to a pattern is dependent on its net preference index, which is obtained by the proposed perceptron. Computer simulations involving several real-world data sets reveal the classification performance of the proposed PROMETHEE-based SLP. The proposed perceptron with the novel preference index performs well compared to that with the original one. (C) 2010 Elsevier B.V. All rights reserved. 
2010 * 117(<- 73): A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications Particle swarm optimization (PSO) is a heuristic global optimization method, proposed originally by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presented a comprehensive investigation of PSO. On the one hand, we reviewed advances in PSO, including its modifications (including quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topology (as fully connected, von Neumann, ring, star, random, etc.), hybridization (with genetic algorithm, simulated annealing, Tabu search, artificial immune system, ant colony algorithm, artificial bee colony, differential evolution, harmonic search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementation (in multicore, multiprocessor, GPU, and cloud computing forms). On the other hand, we offered a survey on applications of PSO to the following eight fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, and chemistry and biology. It is hoped that this survey would be beneficial for the researchers studying PSO algorithms. 2015 * 133(<-135): Nonadditive similarity-based single-layer perceptron for multi-criteria collaborative filtering The main aim of the popular collaborative filtering approaches for recommender systems is to recommend items that users with similar preferences have liked in the past. Although single-criterion recommender systems have been successfully used in several applications, multi-criteria rating systems that allow users to specify ratings for various content attributes for individual items are gaining in importance.
To measure the overall similarity between any two users for multi-criteria collaborative filtering, the indifference relation in outranking relation theory, which can justify discrimination between any two patterns, is suitable for multi-criteria decision making (MCDM). However, nonadditive indifference indices that address interactions among criteria should be taken into account. This paper proposes a novel similarity-based perceptron using nonadditive indifference indices to estimate an overall rating that a user would give to a specific item. The applicability of the proposed model to recommendation of initiators on a group-buying website was examined. Experimental results demonstrate that the proposed model performs well in terms of generalization ability compared to other multi-criteria collaborative filtering approaches. (C) 2013 Elsevier B.V. All rights reserved. 2014 * 137(<-275): A two-stage evolutionary algorithm based on sensitivity and accuracy for multi-class problems The machine learning community has traditionally used correct classification rates or accuracy (C) values to measure classifier performance and has generally avoided presenting classification levels of each class in the results, especially for problems with more than two classes. C values alone are insufficient because they cannot capture the myriad of contributing factors that differentiate the performance of two different classifiers. Receiver Operating Characteristic (ROC) analysis is an alternative to solve these difficulties, but it can only be used for two-class problems. For this reason, this paper proposes a new approach for analysing classifiers based on two measures: C and sensitivity (S) (i.e., the minimum of accuracies obtained for each class). These measures are optimised through a two-stage evolutionary process. 
It was conducted by applying two sequential fitness functions in the evolutionary process, including entropy (E) for the first stage and a new fitness function, area (A), for the second stage. By using these fitness functions, the C level was optimised in the first stage, and the S value of the classifier was generally improved without significantly reducing C in the second stage. This two-stage approach improved S values in the generalisation set (whereas an evolutionary algorithm (EA) based only on the S measure obtains worse S levels) and obtained both high C values and good classification levels for each class. The methodology was applied to solve 16 benchmark classification problems and two complex real-world problems in analytical chemistry and predictive microbiology. It obtained promising results when compared to other competitive multiclass classification algorithms and a multi-objective alternative based on E and S. (C) 2012 Elsevier Inc. All rights reserved. 2012 * 141(<-398): Brain-Computer Evolutionary Multiobjective Optimization: A Genetic Algorithm Adapting to the Decision Maker The centrality of the decision maker (DM) is widely recognized in the multiple criteria decision-making community. This translates into emphasis on seamless human-computer interaction, and adaptation of the solution technique to the knowledge which is progressively acquired from the DM. This paper adopts the methodology of reactive search optimization (RSO) for evolutionary interactive multiobjective optimization. RSO follows to the paradigm of "learning while optimizing," through the use of online machine learning techniques as an integral part of a self-tuning optimization scheme. User judgments of couples of solutions are used to build robust incremental models of the user utility function, with the objective to reduce the cognitive burden required from the DM to identify a satisficing solution. 
The technique of support vector ranking is used together with a k-fold cross-validation procedure to select the best kernel for the problem at hand, during the utility function training procedure. Experimental results are presented for a series of benchmark problems. 2010 * 144(<- 7): Multiple criteria decision aiding for finance: An updated bibliographic survey Finance is a popular field for applied and methodological research involving multiple criteria decision aiding (MCDA) techniques. In this study we present an up-to-date bibliographic survey of the contributions of MCDA in financial decision making, focusing on the developments during the past decade. The survey covers all main areas of financial modeling as well as the different methodological approaches in MCDA and its connections with other analytical fields. On the basis of the survey results, we discuss the contributions of MCDA in different areas of financial decision making and identify established and emerging research topics, as well as future opportunities and challenges. (C) 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the International Federation of Operational Research Societies (IFORS). All rights reserved. 2015 * 146(<-365): Preference disaggregation and statistical learning for multicriteria decision support: A review Disaggregation methods have become popular in multicriteria decision aiding (MCDA) for eliciting preferential information and constructing decision models from decision examples. From a statistical point of view, data mining and machine learning are also involved with similar problems, mainly with regard to identifying patterns and extracting knowledge from data. Recent research has also focused on the introduction of specific domain knowledge in machine learning algorithms. Thus, the connections between disaggregation methods in MCDA and traditional machine learning tools are becoming stronger. 
In this paper the relationships between the two fields are explored. The differences and similarities between the two approaches are identified, and a review is given regarding the integration of the two fields. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 149(<-478): A memetic model of evolutionary PSO for computational finance applications Motivated by the compensatory property of EA and PSO, where the latter can enhance solutions generated from the evolutionary operations by exploiting their individual memory and social knowledge of the swarm, this paper examines the implementation of PSO as a local optimizer for fine tuning in evolutionary search. The proposed approach is evaluated on applications from the field of computational finance, namely portfolio optimization and time series forecasting. Exploiting the structural similarity between these two problems and the non-linear fractional knapsack problem, an instance of the latter is generalized and implemented as the preliminary test platform for the proposed EA-PSO hybrid model. The experimental results demonstrate the positive effects of this memetic synergy and reveal general design guidelines for the implementation of PSO as a local optimizer. Algorithmic performance improvements are similarly evident when extending to the real-world optimization problems under the appropriate integration of PSO with EA. (C) 2008 Elsevier Ltd. All rights reserved. 2009 * 155(<-680): PERCEPTRONS PLAY THE REPEATED PRISONERS-DILEMMA We examine the implications of bounded rationality in repeated games by modeling the repeated game strategies as perceptrons (F. Rosenblatt, ''Principles of Neurodynamics,'' Spartan Books, and M. Minsky and S. A. Papert, ''Perceptrons: An Introduction to Computational Geometry,'' MIT Press, Cambridge, MA, 1988).
In the prisoner's dilemma game, if the cooperation outcome is Pareto efficient, then we can establish the folk theorem by perceptrons with single associative units (Minsky and Papert), whose computational capability barely exceeds what we would expect from players capable of fictitious plays (e.g., L. Shapley, Some topics in two-person games, Adv. Game Theory 5 (1964), 1-28). (C) 1995 Academic Press, Inc. 1995 * 156(<-206): Genetic Algorithms, a Nature-Inspired Tool: A Survey of Applications in Materials Science and Related Fields: Part II Genetic algorithms (GAs) are a helpful tool in optimization, simulation, modelling, design, and prediction purposes in various domains of science including materials science, medicine, technology, economy, industry, environment protection, etc. Reported uses of GAs led to solving of numerous complex computational tasks. In materials science and related fields of science and technology, GAs are routinely used for materials modeling and design, for optimization of material properties, the method is also useful in organizing the material or device production at the industrial scale. Here, the most recent (years 2008-2012) applications of GAs in materials science and in related fields (solid state physics and chemistry, crystallography, production, and engineering) are reviewed. The representative examples selected from recent literature show how broad is the usefulness of this computational method. 2013 * 161(<-625): Developing sorting models using preference disaggregation analysis: An experimental investigation Within the field of multicriteria decision aid, sorting refers to the assignment of a set of alternatives into predefined homogenous groups defined in an ordinal way. The real-world applications of this type of problem extend to a wide range of decision-making fields. 
Preference disaggregation analysis provides the framework for developing sorting models through the analysis of the global judgment of the decision-maker using mathematical programming techniques. However, the automatic elicitation of preferential information through the preference disaggregation analysis raises several issues regarding the impact of the parameters involved in the model development process on the performance and the stability of the developed models. The objective of this paper is to shed light on this issue. For this purpose the UTADIS preference disaggregation sorting method (UTilites Additives DIScriminantes) is considered. The conducted analysis is based on an extensive Monte Carlo simulation and useful findings are obtained on the aforementioned issues. (C) 2003 Elsevier B.V. All rights reserved. 2004 * 163(<- 88): Pareto Front Estimation for Decision Making The set of available multi-objective optimisation algorithms continues to grow. This fact can be partially attributed to their widespread use and applicability. However, this increase also suggests several issues remain to be addressed satisfactorily. One such issue is the diversity and the number of solutions available to the decision maker (DM). Even for algorithms very well suited for a particular problem, it is difficult-mainly due to the computational cost-to use a population large enough to ensure the likelihood of obtaining a solution close to the DM's preferences. In this paper we present a novel methodology that produces additional Pareto optimal solutions from a Pareto optimal set obtained at the end run of any multi-objective optimisation algorithm for two-objective and three-objective problem instances. 
2014 * 164(<-306): Memetic algorithms and memetic computing optimization: A literature review Memetic computing is a subject in computer science which considers complex structures such as the combination of simple agents and memes, whose evolutionary interactions lead to intelligent complexes capable of problem-solving. The founding cornerstone of this subject has been the concept of memetic algorithms, that is a class of optimization algorithms whose structure is characterized by an evolutionary framework and a list of local search components. This article presents a broad literature review on this subject focused on optimization problems. Several classes of optimization problems, such as discrete, continuous, constrained, multi-objective and characterized by uncertainties, are addressed by indicating the memetic "recipes" proposed in the literature. In addition, this article focuses on implementation aspects and especially the coordination of memes which is the most important and characterizing aspect of a memetic structure. Finally, some considerations about future trends in the subject are given. (C) 2011 Elsevier B.V. All rights reserved. 2012 * 165(<-511): Pareto-based multiobjective machine learning: An overview and case studies Machine learning is inherently a multiobjective task. Traditionally, however, either only one of the objectives is adopted as the cost function or multiple objectives are aggregated to a scalar cost function. This can be mainly attributed to the fact that most conventional learning algorithms can only deal with a scalar cost function. Over the last decade, efforts on solving machine learning problems using the Pareto-based multiobjective optimization methodology have gained increasing impetus, particularly due to the great success of multiobjective optimization using evolutionary algorithms and other population-based stochastic search methods. 
It has been shown that Pareto-based multiobjective learning approaches are more powerful compared to learning algorithms with a scalar cost function in addressing various topics of machine learning, such as clustering, feature selection, improvement of generalization ability, knowledge extraction, and ensemble generation. One common benefit of the different multiobjective learning approaches is that a deeper insight into the learning problem can be gained by analyzing the Pareto front composed of multiple Pareto-optimal solutions. This paper provides an overview of the existing research on multiobjective machine learning, focusing on supervised learning. In addition, a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning, e.g., how to identify interpretable models and models that can generalize on unseen data from the obtained Pareto-optimal solutions. Three approaches to Pareto-based multiobjective ensemble generation are compared and discussed in detail. Finally, potentially interesting topics in multiobjective machine learning are suggested. 2008 * 167(<-126): Parameter identification and calibration of the Xin'anjiang model using the surrogate modeling approach Practical experience has demonstrated that single objective functions, no matter how carefully chosen, prove to be inadequate in providing proper measurements for all of the characteristics of the observed data. One strategy to circumvent this problem is to define multiple fitting criteria that measure different aspects of system behavior, and to use multi-criteria optimization to identify non-dominated optimal solutions. Unfortunately, these analyses require running original simulation models thousands of times. As such, they demand prohibitively large computational budgets. 
As a result, surrogate models have been used in combination with a variety of multi-objective optimization algorithms to approximate the true Pareto-front within limited evaluations for the original model. In this study, multi-objective optimization based on surrogate modeling (multivariate adaptive regression splines, MARS) for a conceptual rainfall-runoff model (Xin'anjiang model, XAJ) was proposed. Taking the Yanduhe basin of Three Gorges in the upper stream of the Yangtze River in China as a case study, three evaluation criteria were selected to quantify the goodness-of-fit of observations against calculated values from the simulation model. The three criteria chosen were the Nash-Sutcliffe efficiency coefficient and the relative errors of peak flow (REPF) and runoff volume (RERV). The efficacy of this method is demonstrated on the calibration of the XAJ model. Compared to the single-objective optimization results, it was indicated that the multi-objective optimization method can infer the most probable parameter set. The results also demonstrate that the use of surrogate modeling enables optimization that is much more efficient; and the total computational cost is reduced by about 92.5%, compared to optimization without using surrogate modeling. The results obtained with the proposed method support the feasibility of applying parameter optimization to computationally intensive simulation models, via reducing the number of simulation runs required in the numerical model considerably. 2014 * 169(<-268): Multiresponse Metamodeling in Simulation-Based Design Applications The optimal design of complex systems in engineering requires the availability of mathematical models of the system's behavior as a function of a set of design variables; such models allow the designer to search for the best solution to the design problem.
However, system models (e.g., computational fluid dynamics (CFD) analysis, physical prototypes) are usually time-consuming and expensive to evaluate, and thus unsuited for systematic use during design. Approximate models of system behavior based on limited data, also known as metamodels, allow significant savings by reducing the resources devoted to modeling during the design process. In this work on engineering design based on multiple performance criteria, we propose the use of multi-response Bayesian surrogate models (MR-BSM) to model several aspects of system behavior jointly, instead of modeling each individually. To this end, we formulated a family of multiresponse correlation functions, suitable for prediction of several response variables that are observed simultaneously from the same computer simulation. Using a set of test functions with varying degrees of correlation, we compared the performance of MR-BSM against metamodels built individually for each response. Our results indicate that MR-BSM outperforms individual metamodels in 53% to 75% of the test cases, though the relative performance depends on the sample size, sampling scheme and the actual correlation among the observed response values. In addition, the relative performance of MR-BSM versus individual metamodels was contingent upon the ability to select an appropriate covariance/correlation function for each application, a task for which a modified version of Akaike's Information Criterion was observed to be inadequate. [DOI: 10.1115/1.4006996] 2012 * 170(<-428): Multiobjective global surrogate modeling, dealing with the 5-percent problem When dealing with computationally expensive simulation codes or process measurement data, surrogate modeling methods are firmly established as facilitators for design space exploration, sensitivity analysis, visualization, prototyping and optimization.
Typically the model parameter (=hyperparameter) optimization problem as part of global surrogate modeling is formulated in a single objective way. Models are generated according to a single objective (accuracy). However, this requires an engineer to determine a single accuracy target and measure upfront, which is hard to do if the behavior of the response is unknown. Likewise, the different outputs of a multi-output system are typically modeled separately by independent models. Again, a multiobjective approach would benefit the domain expert by giving information about output correlation and enabling automatic model type selection for each output dynamically. With this paper the authors attempt to increase awareness of the subtleties involved and discuss a number of solutions and applications. In particular, we present a multiobjective framework for global surrogate model generation to help tackle both problems and that is applicable in both the static and sequential design (adaptive sampling) case. 2010 * 211(<-324): Mobility Timing for Agent Communities, a Cue for Advanced Connectionist Systems We introduce a wait-and-chase scheme that models the contact times between moving agents within a connectionist construct. The idea that elementary processors move within a network to get a proper position is borne out both by biological neurons in the brain morphogenesis and by agents within social networks. From the former, we take inspiration to devise a medium-term project for new artificial neural network training procedures where mobile neurons exchange data only when they are close to one another in a proper space (are in contact). From the latter, we accumulate mobility tracks experience. We focus on the preliminary step of characterizing the elapsed time between neuron contacts, which results from a spatial process fitting in the family of random processes with memory, where chasing neurons are stochastically driven by the goal of hitting target neurons. 
Thus, we add an unprecedented mobility model to the literature in the field, introducing a distribution law of the intercontact times that merges features of both negative exponential and Pareto distribution laws. We give a constructive description and implementation of our model, as well as a short analytical form whose parameters are suitably estimated in terms of confidence intervals from experimental data. Numerical experiments show the model and related inference tools to be sufficiently robust to cope with two main requisites for its exploitation in a neural network: the nonindependence of the observed intercontact times and the feasibility of the model inversion problem to infer suitable mobility parameters. 2011 * 218(<-472): Stochastic sampling design using a multi-objective genetic algorithm and adaptive neural networks This paper presents a novel multi-objective genetic algorithm (MOGA) based on the NSGA-II algorithm, which uses metamodels to determine optimal sampling locations for installing pressure loggers in a water distribution system (WDS) when parameter uncertainty is considered. The new algorithm combines the multi-objective genetic algorithm with adaptive neural networks (MOGA-ANN) to locate pressure loggers. The purpose of pressure logger installation is to collect data for hydraulic model calibration. Sampling design is formulated as a two-objective optimization problem in this study. The objectives are to maximize the calibrated model accuracy and to minimize the number of sampling devices as a surrogate of sampling design cost. Calibrated model accuracy is defined as the average of normalized traces of model prediction covariance matrices, each of which is constructed from a randomly generated sampling set of calibration parameter values. This method of calculating model accuracy is called the 'full' fitness model. 
Within the genetic algorithm search process, the full fitness model is progressively replaced with the periodically (re)trained adaptive neural network metamodel where (re)training is done using the data collected by calling the full model. The methodology was first tested on a hypothetical (benchmark) problem to configure the setting requirement. Then the model was applied to a real case study. The results show that significant computational savings can be achieved by using the MOGA-ANN when compared to the approach where MOGA is linked to the full fitness model. When applied to the real case study, optimal solutions identified by MOGA-ANN are obtained 25 times faster than those identified by the full model without significant decrease in the accuracy of the final solution. (C) 2008 Elsevier Ltd. All rights reserved. 2009 * 227(<-505): Learning based brain emotional intelligence as a new aspect for development of an alarm system The multi-criteria and purposeful prediction approach has been introduced and is implemented by the fast and efficient behavioral-based brain emotional learning method. On the other side, emotional learning from the brain model has shown good performance and is characterized by a high generalization property. The new approach is developed to deal with low computational and memory resources and can be used with the largest available data sets. The scope of the paper is to reveal the advantages of emotional learning interpretations of the brain as a purposeful forecasting system designed for warning, and to make a fair comparison between the successful neural (MLP) and neurofuzzy (ANFIS) approaches in their best structures according to prediction accuracy, generalization, and computational complexity. The auroral electrojet (AE) index is used as a practical example of a chaotic time series, and the introduced method is used to make predictions and issue warnings of geomagnetic disturbances and geomagnetic storms based on the AE index.
2008 * 241(<-405): Neural network ensembles: immune-inspired approaches to the diversity of components This work applies two immune-inspired algorithms, namely opt-aiNet and omni-aiNet, to train multi-layer perceptrons (MLPs) to be used in the construction of ensembles of classifiers. The main goal is to investigate the influence of the diversity of the set of solutions generated by each of these algorithms, and if these solutions lead to improvements in performance when combined in ensembles. omni-aiNet is a multi-objective optimization algorithm and, thus, explicitly maximizes the components' diversity at the same time it minimizes their output errors. The opt-aiNet algorithm, by contrast, was originally designed to solve single-objective optimization problems, focusing on the minimization of the output error of the classifiers. However, an implicit diversity maintenance mechanism stimulates the generation of MLPs with different weights, which may result in diverse classifiers. The performances of opt-aiNet and omni-aiNet are compared with each other and with that of a second-order gradient-based algorithm, named MSCG. The results obtained show how the different diversity maintenance mechanisms presented by each algorithm influence the gain in performance obtained with the use of ensembles. 2010 * 242(<-504): The Q-norm complexity measure and the minimum gradient method: A novel approach to the machine learning structural risk minimization problem This paper presents a novel approach for dealing with the structural risk minimization (SRM) applied to a general setting of the machine learning problem. The formulation is based on the fundamental concept that supervised learning is a bi-objective optimization problem in which two conflicting objectives should be minimized. The objectives are related to the empirical training error and the machine complexity. 
In this paper, one general Q-norm method to compute the machine complexity is presented, and, as a particular practical case, the minimum gradient method (MGM) is derived relying on the definition of the fat-shattering dimension. A practical mechanism for parallel layer perceptron (PLP) network training, involving only quasi-convex functions, is generated using the aforementioned definitions. Experimental results on 15 different benchmarks are presented, which show the potential of the proposed ideas. 2008 * 243(<-543): Controlling the parallel layer perceptron complexity using a multiobjective learning algorithm This paper deals with the parallel layer perceptron (PLP) complexity control, and the bias and variance dilemma, using a multiobjective (MOBJ) training algorithm. To control the bias and variance, the training process is rewritten as a bi-objective problem, considering the minimization of both the training error and the norm of the weight vector, which is a measure of the network complexity. This method is applied to regression and classification problems and compared with several other training procedures and topologies. The results show that the PLP MOBJ training algorithm presents good generalization results, outperforming traditional methods in the tested examples. 2007 * 244(<-548): Improving generalization of MLPs with sliding mode control and the Levenberg-Marquardt algorithm A variation of the well-known Levenberg-Marquardt algorithm for training neural networks is proposed in this work. The algorithm presented restricts the norm of the weights vector to a preestablished value and finds the minimum error solution for that norm. The norm constraint controls the neural network's degrees of freedom: the larger the norm, the more flexible the neural model, and therefore the more closely it fits the training set. A range of different norm solutions is generated and the best generalization solution is selected according to the validation set error.
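Several of the entries above (242-244, and 247-249 below) share the same bi-objective formulation: minimize the training error and the weight-vector norm, then choose the best-generalizing model from the resulting trade-off set. A minimal sketch of that selection step, with purely illustrative candidate values (not data from any of the cited papers):

```python
def pareto_front(points):
    """Keep the points not dominated by any other (both objectives minimized)."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)]

# Hypothetical candidate networks as (training error, weight norm) pairs.
candidates = [(0.10, 5.0), (0.05, 7.0), (0.12, 4.0), (0.08, 8.0), (0.20, 10.0)]
front = pareto_front(candidates)  # the error/complexity trade-off set
```

From `front`, the final model would then be picked by validation-set error, as the entries describe.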
The results show the efficiency of the algorithm in terms of generalization performance. (c) 2006 Elsevier B.V. All rights reserved. 2007 * 246(<-560): Many-objective training of a multi-layer perceptron In this paper, a many-objective training scheme for a multi-layer feed-forward neural network is studied. In this scheme, each training data set, or the average over sub-sets of the training data, provides a single objective. A recently proposed group of evolutionary many-objective optimization algorithms based on the NSGA-II algorithm have been examined with respect to the handling of such problem cases. A modified NSGA-II algorithm, using the norm of an individual as a secondary ranking assignment method, appeared to give the best results, even for a large number of objectives (up to 50 in this study). However, there was no notable increase in performance against the standard backpropagation algorithm, and a remarkable drop in performance for higher-dimensional feature spaces (dimension 30 in this study). 2007 * 247(<-645): Training neural networks with a multi-objective sliding mode control algorithm This paper presents a new sliding mode control algorithm that is able to guide the trajectory of a multi-layer perceptron within the plane formed by the two objective functions: training set error and norm of the weight vectors. The results show that the neural networks obtained are able to generate an approximation to the Pareto set, from which an improved generalization performance model is selected. (C) 2002 Elsevier Science B.V. All rights reserved. 2003 * 248(<-661): Recent advances in the MOBJ algorithm for training artificial neural networks. This paper presents a new scheme for training MLPs which employs a relaxation method for multi-objective optimization. The algorithm works by obtaining a reduced set of solutions, from which the one with the best generalization is selected. 
This approach allows balancing between the training error and the norm of the network weight vectors, which are the two objective functions of the multi-objective optimization problem. The method is applied to classification and regression problems and compared with Weight Decay (WD), Support Vector Machines (SVMs) and standard Backpropagation (BP). It is shown that the proposed systematic training procedure results in neural models with good generalization, outperforming traditional methods. 2001 * 249(<-665): Improving generalization of MLPs with multi-objective optimization This paper presents a new learning scheme for improving generalization of multilayer perceptrons. The algorithm uses a multi-objective optimization approach to balance between the error on the training data and the norm of the network weight vectors to avoid overfitting. The results are compared with support vector machines and standard backpropagation. (C) 2000 Elsevier Science B.V. All rights reserved. 2000 * 251(<- 85): Time series forecasting by neural networks: A knee point-based multiobjective evolutionary algorithm approach In this paper, we investigate the problem of time series forecasting using single hidden layer feedforward neural networks (SLFNs), which are optimized via multiobjective evolutionary algorithms. By utilizing adaptive differential evolution (JADE) and the knee point strategy, a nondominated sorting adaptive differential evolution (NSJADE) and its improved version, knee point-based NSJADE (KP-NSJADE), are developed for optimizing SLFNs. JADE, which aims at refining the search area, is introduced into the nondominated sorting genetic algorithm II (NSGA-II). The presented NSJADE shows superiority on multimodal problems when compared with NSGA-II. Then NSJADE is applied to train SLFNs for time series forecasting. It is revealed that individuals with better forecasting performance in the whole population gather around the knee point.
Therefore, KP-NSJADE is proposed to explore the neighborhood of the knee point in the objective space. Simulation results on eight popular time series databases illustrate the effectiveness of the proposed algorithm in comparison with several popular algorithms. (C) 2014 Elsevier Ltd. All rights reserved. 2014 * 252(<-124): An analysis of accuracy-diversity trade-off for hybrid combined system with multiobjective predictor selection This study examines the contribution of diversity, under a multi-objective context, to the promotion of learners in an evolutionary system that generates combinations of partially trained learners. The examined system uses grammar-driven genetic programming to evolve hierarchical, multi-component combinations of multilayer perceptrons and support vector machines for regression. Two advances are studied. First, a ranking formula is developed for the selection probability of the base learners. This formula incorporates both a diversity measure and the performance of learners, and it is tried over a series of artificial and real-world problems. Results show that when the diversity of a learner is incorporated with equal weight to the learner's performance in the evolutionary selection process, the system is able to provide statistically significantly better generalization. The second advance examined is a substitution phase for learners that are over-dominated under a multi-objective Pareto domination assessment scheme. Results here show that the substitution does not significantly improve the system performance; thus the exclusion of very weak learners is not a compelling task for the examined framework.
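One common way to define the knee point used by the KP-NSJADE entry above (251) is the front member farthest from the straight line joining the two extreme solutions. A small sketch of that definition (the choice of distance formula is an assumption, not taken from the paper; the normalizing denominator is dropped since only the argmax matters):

```python
def knee_point(front):
    """Return the front member farthest from the line through the two extremes."""
    front = sorted(front)
    (x1, y1), (x2, y2) = front[0], front[-1]

    def dist(p):
        # Unnormalized point-to-line distance |(y2-y1)x - (x2-x1)y + x2*y1 - y2*x1|.
        return abs((y2 - y1) * p[0] - (x2 - x1) * p[1] + x2 * y1 - y2 * x1)

    return max(front, key=dist)

# Illustrative 2-D front: the sharp bend at (0.2, 0.3) is the knee.
front = [(0.0, 1.0), (0.2, 0.3), (0.6, 0.2), (1.0, 0.0)]
```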
2014 * 255(<-649): Designing a phenotypic distance index for radial basis function neural networks MultiObjective Evolutionary Algorithms (MOEAs) may suffer premature convergence if the selective pressure is too large, so MOEAs usually incorporate a niche-formation procedure to distribute the population over the optimal solutions and let the population evolve until the Pareto-optimal region is completely explored. This niche-formation scheme is based on a distance index that measures the similarity between two solutions in order to decide whether both may share the same niche. The similarity criterion is usually based on a Euclidean norm (given that the two solutions are represented as vectors); nevertheless, as this paper will explain, this kind of metric is not adequate for RBFNNs, making a more suitable distance index necessary. The experimental results obtained show that a MOEA including the proposed distance index is able to explore the Pareto-optimal region sufficiently and provide the user with a wide variety of Pareto-optimal solutions. 2003 * 256(<-658): Hierarchical genetic algorithm for near optimal feedforward neural network design. In this paper, we propose a genetic algorithm based design procedure for a multi-layer feed-forward neural network. A hierarchical genetic algorithm is used to evolve both the neural network's topology and weighting parameters. Compared with traditional genetic algorithm based designs for neural networks, the hierarchical approach addresses several deficiencies, including a feasibility check highlighted in the literature. A multi-objective cost function is used herein to optimize the performance and topology of the evolved neural network simultaneously. In the prediction of the Mackey-Glass chaotic time series, the networks designed by the proposed approach prove to be competitive, or even superior, to traditional learning algorithms for multi-layer perceptron networks and radial basis function networks.
Based upon the chosen cost function, a linear weighted-combination decision-making approach has been applied to derive an approximate Pareto-optimal solution set; designing a set of neural networks can therefore be considered as solving a two-objective optimization problem. 2002 * 257(<-125): Robust parameter design optimization using Kriging, RBF and RBFNN with gradient-based and evolutionary optimization techniques The dual response surface methodology is one of the most commonly used approaches in robust parameter design to simultaneously optimize the mean value and keep the variance minimal. The commonly used meta-model is quadratic polynomial regression; for highly nonlinear input/output relationships, the accuracy of the fitted model is limited. Many researchers have recommended using more complicated surrogate models. In this study, three surrogate models replace the second-order polynomial regression, namely ordinary Kriging, radial basis function approximation (RBF) and the radial basis function artificial neural network (RBFNN). The results show that the three surrogate models present superior accuracy in comparison with quadratic polynomial regression. The mean squared error (MSE) approach is widely used to link the mean and variance in one cost function. In this study, a new approach has been proposed using multi-objective optimization. The new approach has two main advantages over the classical method. First, the conflicting nature of the two objectives can be efficiently handled. Second, the decision maker will have a set of Pareto-front design points to select from. (C) 2014 Elsevier Inc. All rights reserved. 2014 * 260(<-453): Parallel multiobjective memetic RBFNNs design and feature selection for function approximation problems The design of radial basis function neural networks (RBFNNs) still remains a difficult task when they are applied to classification or to regression problems.
The difficulty arises when the parameters that define an RBFNN have to be set: the number of RBFs, the position of their centers and the length of their radii. Another issue that has to be faced when applying these models to real-world applications is selecting the variables that the RBFNN will use as inputs. The literature presents several methodologies to perform these two tasks separately; however, thanks to the intrinsic parallelism of genetic algorithms, the algorithm proposed in this paper can evolve solutions for both problems at the same time. The parallelization of the algorithm consists not only in evolving the two problems jointly but also in specializing the crossover and mutation operators in order to evolve the different elements to be optimized when designing RBFNNs. The underlying genetic algorithm is the non-dominated sorting genetic algorithm II (NSGA-II), which helps to keep a balance between the size of the network and its approximation accuracy in order to avoid overfitted networks. Another novelty of the proposed algorithm is the incorporation of local search algorithms in three stages: initialization of the population, evolution of the individuals and final optimization of the Pareto front. The initialization of the individuals is performed by hybridizing clustering techniques with mutual information (MI) theory to select the input variables. As the experiments will show, the synergy of the different paradigms and techniques combined in the presented algorithm allows very accurate models to be obtained using the most significant input variables. (C) 2009 Published by Elsevier B.V. 2009 * 261(<-544): A new hybrid methodology for cooperative-coevolutionary optimization of radial basis function networks This paper presents a new multiobjective cooperative-coevolutive hybrid algorithm for the design of a Radial Basis Function Network (RBFN).
This approach codifies a population of Radial Basis Functions (RBFs) (hidden neurons), which evolve by means of cooperation and competition to obtain a compact and accurate RBFN. To evaluate the significance of a given RBF in the whole network, three factors have been proposed: the basis function's contribution to the network's output, the error produced in the basis function radius, and the overlapping among RBFs. To achieve an RBFN composed of RBFs with proper values for these quality factors, our algorithm follows a multiobjective approach in the selection process. In the design process, a Fuzzy Rule Based System (FRBS) is used to determine the possibility of applying operators to a certain RBF. As the time required by our evolutionary algorithm to converge is relatively small, it is possible to further improve the solution found by using a local minimization algorithm (for example, the Levenberg-Marquardt method). In this paper, the results of applying our methodology to function approximation and time series prediction problems are also presented and compared with other alternatives proposed in the literature. 2007 * 262(<-638): Multiobjective evolutionary optimization of the size, shape, and position parameters of radial basis function networks for function approximation This paper presents a multiobjective evolutionary algorithm to optimize radial basis function neural networks (RBFNNs) in order to approximate target functions from a set of input-output pairs. The procedure allows the application of heuristics to improve the solution of the problem at hand by including some new genetic operators in the evolutionary process.
These new operators are based on two well-known matrix transformations: singular value decomposition (SVD) and orthogonal least squares (OLS), which have been used to define new mutation operators that produce local or global modifications in the radial basis functions (RBFs) of the networks (the individuals in the population in the evolutionary procedure). After analyzing the efficiency of the different operators, we have shown that the global mutation operators yield an improved procedure to adjust the parameters of the RBFNNs. 2003 * 263(<-204): Memetic multiobjective particle swarm optimization-based radial basis function network for classification problems This paper presents a new multiobjective evolutionary algorithm applied to radial basis function (RBF) network design, based on multiobjective particle swarm optimization augmented with local search features. The algorithm is named the memetic multiobjective particle swarm optimization RBF network (MPSON) because it integrates the accuracy and structure of an RBF network. The proposed algorithm is implemented on two-class and multiclass pattern classification problems, including one complex real problem. The experimental results indicate that the proposed algorithm is viable, and provides an effective means to design multiobjective RBF networks with good generalization capability and compact network structure. The accuracy and complexity of the network obtained by the proposed algorithm are compared with the memetic non-dominated sorting genetic algorithm based RBF network (MGAN) through statistical tests. This study shows that MPSON generates RBF networks with an appropriate balance between accuracy and simplicity, outperforming the other algorithms considered. (C) 2013 Elsevier Inc. All rights reserved.
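Entries 260-263 above all evolve the same underlying model: an RBF network defined by its centers, radii and output weights. A minimal sketch of that model with Gaussian basis functions (parameter names and values are illustrative, not taken from any of the cited papers):

```python
import math

def rbf_output(x, centers, radii, weights, bias=0.0):
    """Evaluate a Gaussian RBF network: bias + sum_i w_i * exp(-||x - c_i||^2 / (2 r_i^2))."""
    out = bias
    for c, r, w in zip(centers, radii, weights):
        d2 = sum((xj - cj) ** 2 for xj, cj in zip(x, c))  # squared distance to center
        out += w * math.exp(-d2 / (2.0 * r * r))
    return out
```

The evolutionary algorithms in these entries search over `centers`, `radii` and `weights` (and the number of basis functions) while trading accuracy against network size.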
2013 * 265(<-325): Memetic Elitist Pareto Differential Evolution algorithm based Radial Basis Function Networks for classification problems This paper presents a new multi-objective evolutionary hybrid algorithm for the design of Radial Basis Function Networks (RBFNs) for classification problems. The algorithm, MEPDEN, is a Memetic Elitist Pareto evolutionary approach based on the Non-dominated Sorting Differential Evolution (NSDE) multiobjective evolutionary algorithm, adapted to design RBFNs and augmented with a local search that uses the back-propagation algorithm. MEPDEN is tested on two-class and multiclass pattern classification problems. The results obtained in terms of Mean Square Error (MSE), number of hidden nodes, accuracy (ACC), sensitivity (SEN), specificity (SPE) and Area Under the receiver operating characteristics Curve (AUC) show that the proposed approach is able to produce higher prediction accuracies with much simpler network structures. The accuracy and complexity of the network obtained by the proposed algorithm are compared with the Memetic Elitist Pareto Non-dominated Sorting Genetic Algorithm based RBFN (MEPGAN) through statistical tests. This study shows that MEPDEN obtains RBFNs with an appropriate balance between accuracy and simplicity, outperforming the other method considered. (C) 2011 Elsevier B.V. All rights reserved. 2011 * 266(<-378): Memetic Pareto Evolutionary Artificial Neural Networks to determine growth/no-growth in predictive microbiology The main objective of this work is to automatically design neural network models with sigmoid basis units for binary classification tasks. The classifiers obtained achieve a double objective: a high classification level on the dataset and a high classification level for each class.
We present MPENSGA2, a Memetic Pareto Evolutionary approach based on the NSGA2 multiobjective evolutionary algorithm, which has been adapted to design Artificial Neural Network models, where the NSGA2 algorithm is augmented with a local search that uses the improved Resilient Backpropagation with backtracking (IRprop+) algorithm. To analyze the robustness of this methodology, it was applied to four complex classification problems in predictive microbiology to describe the growth/no-growth interface of food-borne microorganisms such as Listeria monocytogenes, Escherichia coli R31, Staphylococcus aureus and Shigella flexneri. The results obtained in Correct Classification Rate (CCR), Sensitivity (S) as the minimum of the sensitivities for each class, Area Under the receiver operating characteristic Curve (AUC), and Root Mean Squared Error (RMSE) show that the generalization ability and the classification rate in each class can be more efficiently improved within a multiobjective framework than within a single-objective framework. (C) 2009 Elsevier B.V. All rights reserved. 2011 * 268(<-414): Sensitivity Versus Accuracy in Multiclass Problems Using Memetic Pareto Evolutionary Neural Networks This paper proposes a multiclassification algorithm using multilayer perceptron neural network models. It seeks to boost two conflicting main objectives of multiclassifiers: a high correct classification rate and a high classification rate for each class. This last objective is not usually optimized in classification, but is considered here given the need to obtain high precision in each class in real problems. To solve this machine learning problem, we use a Pareto-based multiobjective optimization methodology based on a memetic evolutionary algorithm. We consider a memetic Pareto evolutionary approach based on the NSGA2 evolutionary algorithm (MPENSGA2).
Once the Pareto front is built, two strategies for automatic individual selection are used: the best model in accuracy and the best model in sensitivity (the extremes of the Pareto front). These methodologies are applied to solve 17 classification benchmark problems obtained from the University of California at Irvine (UCI) repository and one complex real classification problem. The models obtained show high accuracy and a high classification rate for each class. 2010 * 269(<-379): Radial basis function network based on time variant multi-objective particle swarm optimization for medical diseases diagnosis This paper proposes an adaptive evolutionary radial basis function (RBF) network algorithm to evolve the accuracy and connections (centers and weights) of RBF networks simultaneously. The problem of hybrid learning of RBF networks is discussed with multi-objective optimization methods to improve classification accuracy for medical disease diagnosis. In this paper, we introduce a time variant multi-objective particle swarm optimization (TVMOPSO) of radial basis function (RBF) networks for diagnosing medical diseases. This study applied RBF network training to determine whether RBF networks can be developed using TVMOPSO, and the performance is validated based on accuracy and complexity. Our approach is tested on three standard data sets from the UCI machine learning repository. The results show that our approach is a viable alternative and provides an effective means to solve the multi-objective RBF network design problem for medical disease diagnosis. It performs better than RBF networks based on MOPSO and NSGA-II, and is also competitive with other methods in the literature. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 270(<-418): An Adaptive Multiobjective Approach to Evolving ART Architectures In this paper, we present the evolution of adaptive resonance theory (ART) neural network architectures (classifiers) using a multiobjective optimization approach.
In particular, we propose the use of a multiobjective evolutionary approach to simultaneously evolve the weights and the topology of three well-known ART architectures: fuzzy ARTMAP (FAM), ellipsoidal ARTMAP (EAM), and Gaussian ARTMAP (GAM). We refer to the resulting architectures as MO-GFAM, MO-GEAM, and MO-GGAM, and collectively as MO-GART. The major advantage of MO-GART is that it produces a number of solutions for the classification problem at hand that have different levels of merit [accuracy on unseen data (generalization) and size (number of categories created)]. MO-GART is shown to be more elegant (it does not require user intervention to define the network parameters), more effective (of better accuracy and smaller size), and more efficient (faster to produce the solution networks) than other ART neural network architectures that have appeared in the literature. Furthermore, MO-GART is shown to be competitive with other popular classifiers, such as classification and regression trees (CART) and support vector machines (SVMs). 2010 * 271(<-578): Applications of multi-objective structure optimization We present applications of multi-objective evolutionary optimization of feed-forward neural networks (NNs) to two real-world problems, car and face classification. The possibly conflicting requirements on the NNs are speed and classification accuracy, both of which can enhance the embedding systems as a whole. We compare the results to the outcome of a greedy optimization heuristic (magnitude-based pruning) coupled with a multi-objective performance evaluation. For the car classification problem, magnitude-based pruning yields competitive results, whereas for the more difficult face classification, we find that the evolutionary approach to NN design is clearly preferable. (c) 2006 Elsevier B.V. All rights reserved.
2006 * 274(<-113): Metrics to guide a multi-objective evolutionary algorithm for ordinal classification Ordinal classification, or ordinal regression, is a classification problem in which the labels have an ordered arrangement between them. Due to this order, alternative performance evaluation metrics need to be used in order to consider the magnitude of errors. This paper presents a study of the use of a multi-objective optimization approach in the context of ordinal classification. We contribute a study of ordinal classification performance metrics, and propose a new performance metric, the maximum mean absolute error (MMAE). MMAE considers the per-class distribution of patterns and the magnitude of the errors, both issues being crucial for ordinal regression problems. In addition, we empirically show that some of the performance metrics are competitive objectives, which justifies the use of multi-objective optimization strategies. In our case, a multi-objective evolutionary algorithm optimizes an artificial neural network ordinal model with different pairs of metric combinations, and we conclude that the pair of the mean absolute error (MAE) and the proposed MMAE is the most favourable. A study of the relationship between the metrics of this proposal is performed, and the graphical representation in the two-dimensional space where the search of the evolutionary algorithm takes place is analysed. The results obtained show good classification performance, opening new lines of research in the evaluation and model selection of ordinal classifiers. (C) 2014 Elsevier B.V. All rights reserved. 2014 * 276(<-334): Weighting Efficient Accuracy and Minimum Sensitivity for Evolving Multi-Class Classifiers Recently, a multi-objective Sensitivity-Accuracy based methodology has been proposed for building classifiers for multi-class problems. This technique is especially suitable for imbalanced and multi-class datasets.
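The MMAE metric proposed in entry 274 above is simple to state: compute the mean absolute error separately for each true class and report the worst one. A sketch with integer ordinal labels (toy data, not taken from the paper):

```python
def mmae(y_true, y_pred):
    """Maximum over classes of that class's mean absolute label error."""
    per_class = []
    for c in set(y_true):
        errs = [abs(t - p) for t, p in zip(y_true, y_pred) if t == c]
        per_class.append(sum(errs) / len(errs))
    return max(per_class)

# Overall MAE here is 0.5, but the worst class (label 2) has mean error 1.0,
# which is what MMAE exposes on imbalanced or per-class-degraded predictions.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
```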
Moreover, the high computational cost of multi-objective approaches is well known, so more efficient alternatives must be explored. This paper presents an efficient alternative to the Pareto-based solution when considering both Minimum Sensitivity and Accuracy in multi-class classifiers. The alternatives are implemented by extending the Evolutionary Extreme Learning Machine algorithm for training artificial neural networks. Experiments were performed to select the best option after considering alternative proposals and related methods. Based on the experiments, this methodology is competitive in Accuracy, Minimum Sensitivity and efficiency. 2011 * 277(<-561): A cooperative constructive method for neural networks for pattern recognition In this paper, we propose a new constructive method, based on cooperative coevolution, for automatically designing the structure of a neural network for classification. Our approach is based on a modular construction of the neural network by means of a cooperative evolutionary process. This process benefits from the advantages of coevolutionary computation as well as the advantages of constructive methods. The proposed methodology can be easily extended to work with almost any kind of classifier. The evaluation of each module that constitutes the network is made using a multiobjective method. Thus, each new module can be evaluated in a comprehensive way, considering different aspects, such as performance, complexity, or degree of cooperation with the previous modules of the network. In this way, the method has the advantage of considering not only the performance of the networks, but also other features. The method is tested on 40 classification problems from the UCI machine learning repository with very good performance.
The method is thoroughly compared with two other constructive methods, cascade correlation and GMDH networks, and with other classification methods, namely SVM, C4.5, and k nearest-neighbours, as well as an ensemble of neural networks constructed using four different methods. (c) 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. 2007 * 278(<-599): Cooperative coevolution of artificial neural network ensembles for pattern classification This paper presents a cooperative coevolutive approach for designing neural network ensembles. Cooperative coevolution is a recent paradigm in evolutionary computation that allows the effective modeling of cooperative environments. Although, theoretically, a single neural network with a sufficient number of neurons in the hidden layer would suffice to solve any problem, in practice, for many real-world problems it is too hard to construct the appropriate network to solve them. In such problems, neural network ensembles are a successful alternative. Nevertheless, the design of neural network ensembles is a complex task. In this paper, we propose a general framework for designing neural network ensembles by means of cooperative coevolution. The proposed model has two main objectives: first, the improvement of the combination of the trained individual networks; second, the cooperative evolution of such networks, encouraging collaboration among them, instead of a separate training of each network. In order to favor the cooperation of the networks, each network is evaluated throughout the evolutionary process using a multiobjective method. For each network, different objectives are defined, considering not only its performance on the given problem, but also its cooperation with the rest of the networks. In addition, a population of ensembles is evolved, improving the combination of networks and obtaining subsets of networks to form ensembles that perform better than the combination of all the evolved networks.
The proposed model is applied to ten real-world classification problems of very different natures from the UCI machine learning repository and the proben1 benchmark set. In all of them, the performance of the model is better than the performance of standard ensembles in terms of generalization error. Moreover, the size of the obtained ensembles is also smaller. 2005 * 283(<-569): Feature selection for ensembles applied to handwriting recognition Feature selection for ensembles has been shown to be an effective strategy for ensemble creation due to its ability to produce good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper, we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The underpinning paradigm is "overproduce and choose". The algorithm operates on two levels: firstly, it performs feature selection in order to generate a set of classifiers, and then it chooses the best team of classifiers. In order to show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we consider the problem of handwritten digit recognition and use three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we take into account the problem of handwritten month word recognition and use three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrate that the proposed methodology brings compelling improvements when classifiers have to work with very low error rates. Comparisons have been made by considering the recognition rates only.
2006 * 284(<-210): A new approach to radial basis function-based polynomial neural networks: analysis and design In this study, we introduce a new topology of radial basis function-based polynomial neural networks (RPNNs) that is based on a genetically optimized multi-layer perceptron with radial polynomial neurons (RPNs). This paper offers a comprehensive design methodology involving various mechanisms of optimization, especially fuzzy C-means (FCM) clustering and particle swarm optimization (PSO). In contrast to the typical architectures encountered in polynomial neural networks (PNNs), our main objective is to develop a topology and establish a comprehensive design strategy of RPNNs: (a) The architecture of the proposed network consists of radial polynomial neurons (RPNs). These neurons are fully reflective of the structure encountered in numeric data, which are granulated with the aid of FCM clustering. The RPN dwells on the concepts of a collection of radial basis functions and function-based nonlinear polynomial processing. (b) The PSO-based design procedure applied to each layer of the RPNN leads to the selection of preferred nodes of the network whose local parameters (such as the number of input variables, a collection of the specific subset of input variables, the order of the polynomial, the number of clusters of FCM clustering, and a fuzzification coefficient of the FCM method) are properly adjusted. The performance of the RPNN is quantified through a series of experiments where we use several modeling benchmarks, namely synthetic three-dimensional data and machine learning data (computer hardware data, abalone data, MPG data, and Boston housing data) already used in neuro-fuzzy modeling. A comparative analysis shows that the proposed RPNN exhibits higher accuracy in comparison with some previous models available in the literature.
2013 * 286(<-576): Using a multi-objective genetic algorithm for SVM construction Support Vector Machines are kernel machines useful for classification and regression problems. In this paper, they are used for non-linear regression of environmental data. From a structural point of view, Support Vector Machines are particular Artificial Neural Networks and their training paradigm has some positive implications. In fact, the original training approach is useful to overcome the curse of dimensionality and overly strict assumptions on the statistics of the errors in data. Support Vector Machines and Radial Basis Function Regularised Networks are presented within a common structural framework for non-linear regression in order to emphasise the training strategy for support vector machines and to better explain the multi-objective approach in support vector machines' construction. A support vector machine's performance depends on the kernel parameter, input selection and epsilon-tube optimal dimension. These will be used as decision variables for the evolutionary strategy based on a Genetic Algorithm, which exhibits the number of support vectors, for the capacity of the machine, and the fitness to a validation subset, for the model accuracy in mapping the underlying physical phenomena, as objective functions. The strategy is tested on a case study dealing with groundwater modelling, based on time series (past measured rainfalls and levels) for level predictions at variable time horizons. 2006 * 287(<-209): A multi-objective micro genetic ELM algorithm The extreme learning machine (ELM) is a methodology for learning single-hidden layer feedforward neural networks (SLFN) which has been proved to be extremely fast and to provide very good generalization performance. ELM works by randomly choosing the weights and biases of the hidden nodes and then analytically obtaining the output weights and biases for an SLFN with the number of hidden nodes previously fixed.
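The ELM mechanism summarized in entry 287 above (random, untrained hidden parameters; output weights obtained analytically) is concrete enough to sketch. A minimal NumPy illustration on hypothetical toy data — plain ELM only, not the mu G-ELM algorithm the entry goes on to propose; data, seed, and network size are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression task: learn y = sin(x) on [-3, 3]
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X)

# Step 1: randomly choose hidden weights and biases (these are never trained)
n_hidden = 40
W = rng.normal(size=(1, n_hidden))
b = rng.normal(size=n_hidden)

# Step 2: hidden-layer output matrix under a sigmoid activation
H = 1.0 / (1.0 + np.exp(-(X @ W + b)))

# Step 3: output weights obtained analytically via the Moore-Penrose pseudo-inverse
beta = np.linalg.pinv(H) @ y

mse = float(np.mean((H @ beta - y) ** 2))
print(f"training MSE: {mse:.5f}")
```

Because only the linear output layer is solved for, training reduces to one pseudo-inverse; the multi-objective extensions cited in this entry then search over the remaining free choice, the number of hidden nodes.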
In this work, we develop a multi-objective micro genetic ELM (mu G-ELM) which provides the appropriate number of hidden nodes for the problem being solved as well as the weights and biases which minimize the mean square error (MSE). The multi-objective algorithm is guided by two criteria: the number of hidden nodes and the MSE. Furthermore, as a novelty, mu G-ELM incorporates a regression device in order to decide whether the number of hidden nodes of the individuals of the population should be increased, decreased, or left unchanged. In general, the proposed algorithm achieves lower errors while also requiring fewer hidden nodes for the data sets and competitors considered. (C) 2013 Elsevier B.V. All rights reserved. 2013 * 288(<-424): A multi-objective memetic and hybrid methodology for optimizing the parameters and performance of artificial neural networks The use of artificial neural networks implies considerable time spent choosing a set of parameters that contribute toward improving the final performance. Initial weights, the number of hidden nodes and layers, training algorithm rates and transfer functions are normally selected through a manual process of trial-and-error that often fails to find the best possible set of neural network parameters for a specific problem. This paper proposes an automatic search methodology for the optimization of the parameters and performance of neural networks relying on the use of Evolution Strategies, Particle Swarm Optimization and concepts from Genetic Algorithms corresponding to the hybrid and global search module. There is also a module that refers to local searches, including the well-known Multilayer Perceptrons, Back-propagation and the Levenberg-Marquardt training algorithms. The methodology proposed here performs the search using the aforementioned parameters in an attempt to optimize the networks and performance.
Experiments were performed and the results proved the proposed method to be better than trial-and-error and other methods found in the literature. Crown Copyright (C) 2009 Published by Elsevier B.V. All rights reserved. 2010 * 293(<-170): MULTI-OBJECTIVE OPTIMIZATION BY MEANS OF MULTI-DIMENSIONAL MLP NEURAL NETWORKS In this paper, a multi-layer perceptron (MLP) neural network (NN) is put forward as an efficient tool for performing two tasks: 1) optimization of multi-objective problems and 2) solving a non-linear system of equations. In both cases, mathematical functions which are continuous and partially bounded are involved. Previously, these two tasks were performed by recurrent neural networks and also strong algorithms like evolutionary ones. In this study, multi-dimensional structure in the output layer of the MLP-NN, as an innovative method, is utilized to implicitly optimize the multivariate functions under the network energy optimization mechanism. To this end, the activation functions in the output layer are replaced with the multivariate functions intended to be optimized. The effective training parameters in the global search are surveyed. Also, it is demonstrated that the MLP-NN with proper dynamic learning rate is able to find globally optimal solutions. Finally, the efficiency of the MLP-NN in both aspects of speed and power is investigated by some well-known experimental examples. In some of these examples, the proposed method gives explicitly better globally optimal solutions compared to that of the other references and also shows completely satisfactory results in other experiments. 2014 * 295(<-659): Improving neural networks generalization with new constructive and pruning methods This paper presents a new constructive method and pruning approaches to control the design of Multi-Layer Perceptron (MLP) without loss in performance. The proposed methods use a multi-objective approach to guarantee generalization. 
The constructive approach searches for an optimal solution according to the Pareto set shape with an increasing number of hidden nodes. The pruning methods are able to simplify the network topology and to identify linear connections between the inputs and outputs of the neural model. Topology information and validation sets are used. 2002 * 301(<-536): Learning multicriteria fuzzy classification method PROAFTN from data In this paper, we present a new methodology for learning the parameters of the multiple criteria classification method PROAFTN from data. There are numerous representations and techniques available for data mining, for example decision trees, rule bases, artificial neural networks, density estimation, regression and clustering. The PROAFTN method constitutes another approach for data mining. It belongs to the class of supervised learning algorithms and assigns membership degrees of the alternatives to the classes. The PROAFTN method requires the elicitation of its parameters for the purpose of classification. Therefore, we need an automatic method that helps us to establish these parameters from the given data with minimum classification errors. Here, we propose a variable neighborhood search metaheuristic for obtaining these parameters. The performance of the newly proposed method was evaluated using the 10-fold cross-validation technique. The results are compared with those obtained by other classification methods previously reported on the same data. It appears that solutions of substantially better quality are obtained with the proposed method than with the former ones. Crown Copyright (c) 2005 Published by Elsevier Ltd. All rights reserved. 2007 * 302(<-546): Fuzzy integral-based perceptron for two-class pattern classification problems The single-layer perceptron with a single output node is a well-known neural network for two-class classification problems. Furthermore, the sigmoid or logistic function is usually used as the activation function in the output neuron.
A critical step is to compute the sum of the products of the connection weights with the corresponding inputs, which embodies an assumption of additivity among the individual variables. Unfortunately, because the input variables are not always independent of each other, an assumption of additivity may not be reasonable. In this paper, the inner product is replaced with an aggregation value obtained by a useful fuzzy integral, viewing each of the connection weights as a value of a lambda-fuzzy measure for the corresponding variable. A genetic algorithm is then employed to obtain connection weights by maximizing the number of correctly classified training patterns and minimizing the errors between the actual and desired outputs of individual training patterns. The experimental results further demonstrate that the proposed method outperforms the traditional single-layer perceptron and performs well in comparison with other fuzzy or non-fuzzy classification methods. (c) 2006 Elsevier Inc. All rights reserved. 2007 * 303(<-590): Training of multilayer perceptron neural networks by using cellular genetic algorithms This paper deals with a method for training neural networks by using cellular genetic algorithms (CGA). This method was implemented as software, CGANN-Trainer, which was used to generate binary classifiers for recognition of patterns associated with breast cancer images in a multi-objective optimization problem. The results reached by the CGA with the Wisconsin Breast Cancer Database, and the Wisconsin Diagnostic Breast Cancer Database, were compared with some other methods previously reported using the same databases, proving to be an interesting alternative. 2006 * 304(<-632): Multicriteria fuzzy classification procedure PROCFTN: methodology and medical application In this paper, we introduce a new classification procedure for assigning objects to predefined classes, named PROCFTN.
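The fuzzy-integral aggregation described in entry 302 above replaces the perceptron's additive inner product with a fuzzy (Choquet-type) integral over a non-additive measure. A generic sketch of that aggregation step with assumed measure values — the entry's lambda-fuzzy measure construction and GA-based weight learning are not reproduced here:

```python
def choquet(values, measure):
    """Discrete Choquet integral of `values` w.r.t. the set function `measure`.

    `measure` maps frozensets of input indices to [0, 1], with the empty set
    at 0.0 and the full index set at 1.0; it need not be additive.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    remaining = set(order)
    total, prev = 0.0, 0.0
    for i in order:  # sweep the input values in ascending order
        total += (values[i] - prev) * measure[frozenset(remaining)]
        prev = values[i]
        remaining.discard(i)
    return total

# Assumed non-additive measure over two inputs: the pair is worth less than
# the sum of the singletons (modeling redundant, correlated variables).
g = {frozenset(): 0.0, frozenset({0}): 0.6,
     frozenset({1}): 0.7, frozenset({0, 1}): 1.0}

choquet_value = choquet([0.5, 0.2], g)
additive_value = 0.5 * 0.6 + 0.2 * 0.7  # the plain inner product, for contrast
```

With correlated inputs the Choquet value discounts the overlap, which is precisely the effect an additive inner product cannot express.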
This procedure is based on a fuzzy scoring function for choosing a subset of prototypes, which bear the closest resemblance to the object to be assigned. It then applies the majority-voting rule to assign an object to a class. We also present a medical application of this procedure as an aid in the diagnosis of central nervous system tumours. The results are compared with those obtained by other classification methods, reported on the same data set, including decision tree, production rules, neural network, k nearest neighbor, multilayer perceptron and logistic regression. Our results are very encouraging and show that the multicriteria decision analysis approach can be successfully used to help medical diagnosis. Crown Copyright (C) 2003 Published by Elsevier B.V. All rights reserved. 2004 * 315(<-685): ARTIFICIAL NEURAL NETWORKS VERSUS NATURAL NEURAL NETWORKS - A CONNECTIONIST PARADIGM FOR PREFERENCE ASSESSMENT Preference is an essential ingredient in all decision processes. This paper presents a new connectionist paradigm for preference assessment in a general multicriteria decision setting. A general structure of an artificial neural network for representing two specified prototypes of preference structures is discussed. An interactive preference assessment procedure and an autonomous learning algorithm based on a novel scheme of supervised learning are proposed. Operating characteristics of the proposed paradigm are also illustrated through detailed results of numerical simulations. 1994 * 316(<- 52): Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data In industrial process control, there may be multiple performance objectives, depending on salient features of the input-output data. To address this situation, this paper proposes multiple actor-critic structures to obtain the optimal control via input-output data for unknown nonlinear systems.
The shunting inhibitory artificial neural network (SIANN) is used to classify the input-output data into one of several categories. Different performance measure functions may be defined for disparate categories. The approximate dynamic programming algorithm, which contains a model module, a critic network, and an action network, is used to establish the optimal control in each category. A recurrent neural network (RNN) model is used to reconstruct the unknown system dynamics using input-output data. NNs are used to approximate the critic and action networks, respectively. It is proven that the model error and the closed unknown system are uniformly ultimately bounded. Simulation results demonstrate the performance of the proposed optimal control scheme for the unknown nonlinear system. 2015 * 321(<-610): Evolutionary multi-objective optimization for simultaneous generation of signal-type and symbol-type representations It has been a controversial issue in cognitive science and artificial intelligence research whether signal-type representations (typically connectionist networks) or symbol-type representations (e.g., semantic networks, production systems) should be used. Meanwhile, it has also been recognized that both types of information representations might exist in the human brain. In addition, symbol-type representations are often very helpful in gaining insights into unknown systems. For these reasons, comprehensible symbolic rules need to be extracted from trained neural networks. In this paper, an evolutionary multi-objective algorithm is employed to generate multiple models that facilitate the generation of signal-type and symbol-type representations simultaneously.
It is argued that one main difference between signal-type and symbol-type representations lies in the fact that signal-type representations are models of a higher complexity (fine representation), whereas symbol-type representations are models of a lower complexity (coarse representation). Thus, by generating models with a spectrum of model complexity, we are able to obtain a population of models of both signal-type and symbol-type quality, although certain post-processing is needed to get a fully symbol-type representation. An illustrative example is given on generating neural networks for the breast cancer diagnosis benchmark problem. 2005 * 324(<-377): A multi-objective artificial immune algorithm for parameter optimization in support vector machine Support vector machine (SVM) is a classification method based on the structural risk minimization principle. The penalty parameter C and kernel parameter sigma of an SVM must be carefully selected in establishing an efficient SVM model. These parameters are usually selected by trial and error or from human experience. An artificial immune system (AIS) can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. A multi-objective artificial immune algorithm has been used to optimize the kernel and penalty parameters of the SVM in this paper. In the training stage of the SVM, multiple solutions are found by using the multi-objective artificial immune algorithm, and then these parameters are evaluated in the test stage. The proposed algorithm is applied to fault diagnosis of induction motors and anomaly detection problems, and successful results are obtained. (c) 2009 Elsevier B.V. All rights reserved. 2011 * 326(<-687): USING GENETIC ALGORITHMS FOR AN ARTIFICIAL NEURAL-NETWORK MODEL INVERSION Genetic algorithms (GAs) and artificial neural networks (ANNs) are techniques for optimization and learning, respectively, which both have been adopted from nature.
Their main advantage over traditional techniques is the relatively better performance when applied to complex relations. GAs and ANNs are both self-learning systems, i.e., they do not require any background knowledge from the creator. In this paper, we describe the performance of a GA that finds hypothetical physical structures of poly(ethylene terephthalate) (PET) yarns corresponding to a certain combination of mechanical and shrinkage properties. This GA uses a validated ANN that has been trained for the complex relation between the structure and properties of PET. This technique was tested by comparing the optimal points found by the GA with known experimental data under a variety of multi-criteria conditions. 1993 * 329(<-271): Convergence analysis of sliding mode trajectories in multi-objective neural networks learning The Pareto-optimality concept is used in this paper in order to represent a constrained set of solutions that are able to trade-off the two main objective functions involved in neural networks supervised learning: data-set error and network complexity. The neural network is described as a dynamic system having error and complexity as its state variables and learning is presented as a process of controlling a learning trajectory in the resulting state space. In order to control the trajectories, sliding mode dynamics is imposed on the network. It is shown that arbitrary learning trajectories can be achieved by maintaining the sliding mode gains within their convergence intervals. Formal proofs of convergence conditions are therefore presented. The concept of trajectory learning presented in this paper goes beyond the selection of a final state in the Pareto set, since it can be reached through different trajectories and states in the trajectory can be assessed individually against an additional objective function. (c) 2012 Elsevier Ltd. All rights reserved.
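Several entries above (e.g., 295 and 329) treat error and network complexity as competing objectives and retain only the non-dominated candidates. A minimal, generic sketch of such a Pareto filter, with hypothetical (error, hidden-node) pairs — not the sliding-mode machinery of entry 329:

```python
def pareto_front(points):
    """Keep the non-dominated points, minimizing every objective.

    A point is dominated if some other point is no worse in all objectives
    and differs from it (hence strictly better in at least one).
    """
    front = []
    for p in points:
        dominated = any(
            q != p and all(q[i] <= p[i] for i in range(len(p)))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical candidate networks: (validation error, number of hidden nodes)
candidates = [(0.10, 12), (0.15, 6), (0.12, 8), (0.20, 4), (0.11, 12)]
front = pareto_front(candidates)
```

Each surviving pair trades accuracy against complexity; picking one final network from this set is exactly the decision step the multi-objective methods above leave to the designer.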
2012 * 332(<-597): Intelligent interactive multiobjective optimization method and its application to reliability optimization In most practical situations involving reliability optimization, there are several mutually conflicting goals such as maximizing the system reliability and minimizing the cost, weight and volume. This paper develops an effective multiobjective optimization method, the Intelligent Interactive Multiobjective Optimization Method (IIMOM). In IIMOM, the general concept of the model parameter vector is proposed. From a practical point of view, a designer's preference structure model is built using Artificial Neural Networks (ANNs) with the model parameter vector as the input and the preference information articulated by the designer over representative samples from the Pareto frontier as the desired output. Then with the ANN model of the designer's preference structure as the objective, an optimization problem is solved to search for improved solutions and guide the interactive optimization process intelligently. IIMOM is applied to the reliability optimization problem of a multi-stage mixed system with five different value functions simulating the designer in the solution evaluation process. The results illustrate that IIMOM is effective in capturing different kinds of preference structures of the designer, and it provides a complete and effective solution for medium- and small-scale multiobjective optimization problems. 2005 * 336(<-644): Simulation metamodeling through artificial neural networks Simulation metamodeling has been a major research field during the last decade. The main objective has been to provide robust, fast decision support aids to enhance the overall effectiveness of decision-making processes. This paper discusses the importance of simulation metamodeling through artificial neural networks (ANNs), and provides general guidelines for the development of ANN-based simulation metamodels. 
Such guidelines were successfully applied in the development of two ANNs trained to estimate the manufacturing lead times (MLT) for orders simultaneously processed in a four-machine job shop. The design of intelligent systems such as ANNs may help to avoid some of the drawbacks of traditional computer simulation. Metamodels offer significant advantages regarding time consumption and simplicity in evaluating multi-criteria situations. Their operation is remarkably fast compared to the time required to operate conventional simulation packages. (C) 2003 Elsevier Ltd. All rights reserved. 2003 * 340(<-640): Speeding up backpropagation using multiobjective evolutionary algorithms The use of backpropagation for training artificial neural networks (ANNs) is usually associated with a long training process. The user needs to experiment with a number of network architectures; with larger networks, more computational cost in terms of training time is required. The objective of this letter is to present an optimization algorithm, comprising a multiobjective evolutionary algorithm and a gradient-based local search. In the rest of the letter, this is referred to as the memetic Pareto artificial neural network algorithm for training ANNs. The evolutionary approach is used to train the network and simultaneously optimize its architecture. The result is a set of networks, with each network in the set attempting to optimize both the training error and the architecture. We also present a self-adaptive version with lower computational cost. We show empirically that the proposed method is capable of reducing the training time compared to gradient-based techniques. 2003 * 343(<-675): Pattern classification by linear goal programming and its extensions Pattern classification is one of the main themes in pattern recognition, and has been tackled by several methods such as statistical methods, artificial neural networks, mathematical programming and so on.
Among them, the multi-surface method proposed by Mangasarian is very attractive, because it can provide an exact discrimination function even for highly nonlinear problems without any assumption on the data distribution. However, the method often causes many slits on the discrimination curve. In other words, the piecewise linear discrimination curve is sometimes too complex, resulting in a poor generalization ability. In this paper, several trials to overcome the difficulties of the multi-surface method are suggested. One of them is the utilization of goal programming, in which the auxiliary linear programming problem is formulated as a goal program in order to get as simple discrimination curves as possible. Another one is to apply fuzzy programming, by which we can get fuzzy discrimination curves with gray zones. In addition, it will be shown that, using the suggested methods, additional learning can be carried out easily. These features of the methods make the discrimination more realistic. The effectiveness of the methods is shown on the basis of some applications. 1998 * 344(<-404): Multiple criteria optimization-based data mining methods and applications: a systematic survey Support Vector Machine, an optimization technique, is well known in the data mining community. In fact, many other optimization techniques have been effectively used in dealing with data separation and analysis. For the last 10 years, the author and his colleagues have proposed and extended a series of optimization-based classification models via Multiple Criteria Linear Programming (MCLP) and Multiple Criteria Quadratic Programming (MCQP). These methods are different from statistics, decision tree induction, and neural networks. The purpose of this paper is to review the basic concepts and frameworks of these methods and promote the research interests in the data mining community. According to the evolution of multiple criteria programming, the paper starts with the bases of MCLP.
Then, it further discusses penalized MCLP, MCQP, Multiple Criteria Fuzzy Linear Programming (MCFLP), Multi-Class Multiple Criteria Programming (MCMCP), and the kernel-based Multiple Criteria Linear Program, as well as MCLP-based regression. This paper also outlines several applications of Multiple Criteria optimization-based data mining methods, such as Credit Card Risk Analysis, Classification of HIV-1 Mediated Neuronal Dendritic and Synaptic Damage, Network Intrusion Detection, Firm Bankruptcy Prediction, and VIP E-Mail Behavior Analysis. 2010 * 346(<-652): Multi-objective cooperative coevolution of artificial neural networks (multi-objective cooperative networks) In this paper we present a cooperative coevolutive model for the evolution of neural network topology and weights, called MOBNET. MOBNET evolves subcomponents that must be combined in order to form a network, instead of whole networks. The problem of assigning credit to the subcomponents is approached as a multi-objective optimization task. The subcomponents in a cooperative coevolutive model must fulfill different criteria to be useful; these criteria usually conflict with each other. The problem of evaluating the fitness of an individual based on many criteria that must be optimized together can be approached as a multi-criteria optimization problem, so the methods from multi-objective optimization offer the most natural way to solve the problem. In this work we show how, by using several objectives for every subcomponent and evaluating its fitness as a multi-objective optimization problem, the performance of the model is highly competitive. MOBNET is compared with several standard methods of classification and with other neural network models in solving four real-world problems, and it shows the best overall performance of all classification methods applied. It also produces smaller networks when compared to other models.
The basic idea underlying MOBNET is extensible to a more general model of coevolutionary computation, as none of its features are exclusive to neural network design. There are many applications of cooperative coevolution that could benefit from the multi-objective optimization approach proposed in this paper. (C) 2002 Elsevier Science Ltd. All rights reserved. 2002 * 347(<-362): Learning in the feed-forward random neural network: A critical review The Random Neural Network (RNN) has received, since its inception in 1989, considerable attention and has been successfully used in a number of applications. In this critical review paper we focus on the feed-forward RNN model and its ability to solve classification problems. In particular, we paid special attention to the RNN literature related to learning algorithms that discover the RNN interconnection weights, suggested other potential algorithms that can be used to find the RNN interconnection weights, and compared the RNN model with other neural-network based and non-neural network based classifier models. In review, the extensive literature review and experimentation with the RNN feed-forward model provided us with the necessary guidance to introduce six critical review comments that identify some gaps in the RNN-related literature and suggest directions for future research. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 352(<-662): Evolutionary optimization of RBF networks. One of the main obstacles to the widespread use of artificial neural networks is the difficulty of adequately defining values for their free parameters. This article discusses how Radial Basis Function (RBF) networks can have their parameters defined by genetic algorithms. To this end, it presents an overall view of the problems involved and the different approaches used to genetically optimize RBF networks.
A new strategy to optimize RBF networks using genetic algorithms is proposed, which includes a new representation, a new crossover operator and the use of a multiobjective optimization criterion. Experiments using a benchmark problem are performed and the results achieved using this model are compared to those achieved by other approaches. 2001 * 356(<-594): Stopping criteria for ensemble of evolutionary artificial neural networks The formation of ensembles of artificial neural networks has attracted the attention of researchers in the machine learning and statistical inference domains. It has been shown that combining different neural networks could improve the generalization ability of the learning machine. One challenge is when to stop the training or evolution of the neural networks to avoid overfitting. In this paper, we show that different early stopping criteria based on (i) the minimum validation fitness of the ensemble, and (ii) the minimum of the average population validation fitness could generalize better than the survival population in the last generation. The proposition was tested on four different ensemble methods: (i) a simple ensemble method, where each individual of the population (created and maintained by the evolutionary process) is used as a committee member, (ii) an ensemble with an island model as a diversity promotion mechanism, (iii) a recent successful ensemble method, namely an ensemble with negative correlation learning, and (iv) an ensemble formed by applying multi-objective optimization. The experimental results suggested that using the minimum validation fitness of the ensemble as an early stopping criterion is beneficial. (C) 2005 Elsevier B.V. All rights reserved. 2005 * 369(<-499): Hybrid multiobjective evolutionary design for artificial neural networks Evolutionary algorithms are a class of stochastic search methods that attempt to emulate the biological process of evolution, incorporating concepts of selection, reproduction, and mutation.
In recent years, there has been an increase in the use of evolutionary approaches in the training of artificial neural networks (ANNs). While evolutionary techniques for neural networks have been shown to provide superior performance over conventional training approaches, the simultaneous optimization of network performance and architecture will almost always result in a slow training process due to the added algorithmic complexity. In this paper, we present a geometrical measure based on the singular value decomposition (SVD) to estimate the necessary number of neurons to be used in training a single-hidden-layer feedforward neural network (SLFN). In addition, we develop a new hybrid multiobjective evolutionary approach that includes the features of a variable-length representation that allows for easy adaptation of neural network structures, an architectural recombination procedure based on the geometrical measure that adapts the number of necessary hidden neurons and facilitates the exchange of neuronal information between candidate designs, and a microhybrid genetic algorithm (mu HGA) with an adaptive local search intensity scheme for local fine-tuning. In addition, the performances of well-known algorithms as well as the effectiveness and contributions of the proposed approach are analyzed and validated through a variety of data set types.
Finally, the procedure is extended to MOP and MLP problems. To solve the resulting differential equations, a steepest descent search technique is used. This proposed nontraditional algorithm is efficient for solving complex problems, and is especially useful for implementation on large-scale VLSI, in which the MOP and MLP problems can be solved on a real-time basis. To illustrate the approach, several numerical examples are solved and compared. (C) 2004 Elsevier Ltd. All rights reserved. 2004 * 377(<-513): Soft computing in engineering design - A review The present paper surveys the application of soft computing (SC) techniques in engineering design. Within this context, fuzzy logic (FL), genetic algorithms (GA) and artificial neural networks (ANN), as well as their fusion, are reviewed in order to examine the capability of soft computing methods and techniques to effectively address various hard-to-solve design tasks and issues. Both these tasks and issues are studied in the first part of the paper, accompanied by references to some results extracted from a survey performed in some industrial enterprises. The second part of the paper makes an extensive review of the literature regarding the application of soft computing (SC) techniques in engineering design. Although this review cannot be collectively exhaustive, it may be considered as a valuable guide for researchers who are interested in the domain of engineering design and wish to explore the opportunities offered by fuzzy logic, artificial neural networks and genetic algorithms for further improvement of both the design outcome and the design process itself. An arithmetic method is used in order to evaluate the review results, to locate the research areas where SC has already given considerable results and to reveal new research opportunities. (C) 2007 Elsevier Ltd. All rights reserved.
2008 * 379(<-274): Computational algorithms inspired by biological processes and evolution In recent times, computational algorithms inspired by biological processes and evolution have been gaining much popularity for solving science and engineering problems. These algorithms are broadly classified into evolutionary computation and swarm intelligence algorithms, which are derived based on the analogy of natural evolution and biological activities. These include genetic algorithms, genetic programming, differential evolution, particle swarm optimization, ant colony optimization, artificial neural networks, etc. The algorithms, being random-search techniques, use heuristics to guide the search towards the optimal solution and speed up convergence to obtain globally optimal solutions. The bio-inspired methods have several attractive features and advantages compared to conventional optimization solvers. They also offer the advantage of a simultaneous simulation and optimization environment for solving real-world problems that are hard to define in simple expressions. These biologically inspired methods have provided novel ways of problem-solving for practical problems in traffic routing, networking, games, industry, robotics, economics, mechanical, chemical, electrical, civil, water resources and other fields. This article discusses the key features and development of bio-inspired computational algorithms, and their scope for application in science and engineering fields. 2012 * 386(<-683): FUZZY THRESHOLD FUNCTIONS AND APPLICATIONS The set of fuzzy threshold functions is defined to be a fuzzy set over the set of functions. All threshold functions have full membership in this fuzzy set. The paper defines and investigates a distance measure between a non-linearly separable function and the set of all threshold functions.
It defines an explicit expression for the membership function of a fuzzy threshold function through the use of this distance measure and finds three upper bounds for this measure. It presents a general method to compute the distance, an algorithm to generate the representation automatically, and a procedure to determine the proper weights and thresholds automatically. It also presents the relationships among threshold gate networks, artificial neural networks and fuzzy neural networks. The results may have useful applications in logic design, pattern recognition, fuzzy logic, multi-objective fuzzy optimization and related areas. 1995 * 391(<-579): Nonessential objectives within network approaches for MCDM In Gal and Hanne [Eur. J. Oper. Res. 119 (1999) 373], the problem of using several methods to solve a multiple criteria decision making (MCDM) problem with linear objective functions after dropping nonessential objectives is analyzed. It turned out that the solution need not be the same when using various methods for solving the system containing the nonessential objectives or not. In this paper we consider the application of network approaches for multicriteria decision making, such as neural networks and an approach for combining MCDM methods (called MCDM networks). We discuss questions of comparing the results obtained with several methods as applied to the problem with or without nonessential objectives. In particular, we argue for considering redundancies such as nonessential objectives as a native feature in complex information processing. In contrast to previous results on nonessential objectives, the current paper focuses on discrete MCDM problems, which are also denoted as multiple attribute decision making (MADM) problems. (c) 2004 Elsevier B.V. All rights reserved.
2006 * 392(<-666): Clustering and selection of multiple criteria alternatives using unsupervised and supervised neural networks There are decision-making problems that involve grouping and selecting a set of alternatives. Traditional decision-making approaches treat different sets of alternatives with the same method of analysis and selection. In this paper, we propose clustering alternatives into different sets so that different methods of analysis, selection, and implementation can be applied to each set. We consider multiple criteria decision-making alternatives where the decision-maker is faced with several conflicting and non-commensurate objectives (or criteria). For example, consider buying a set of computers for a company that vary in terms of their functions, prices, and computing powers. In this paper, we develop theories and procedures for clustering and selecting discrete multiple criteria alternatives. The sets of alternatives clustered are mutually exclusive and are based on (1) similar features among alternatives, and (2) the preferential structure of the decision-maker. The decision-making process can be broken down into three steps: (1) generating alternatives; (2) grouping or clustering alternatives based on the similarity of their features; and (3) choosing one or more alternatives from each cluster of alternatives. We utilize unsupervised clustering artificial neural networks (ANNs) with variable weights for the clustering of alternatives, and we use feedforward ANNs for the selection of the best alternatives from each cluster of alternatives. The decision-maker is interactively involved by comparing and contrasting alternatives within each group so that the best alternative can be selected from each group. For the learning mechanism of the ANN, we propose using a generalized Euclidean distance whereby, by changing its coefficients, new formations of clusters of alternatives can be achieved.
The algorithm is interactive and the results are independent of the initial set-up information. Some examples and computational results are presented. 2000 * 395(<-215): A novel artificial immune clonal selection classification and rule mining with swarm learning model Metaheuristic optimisation algorithms have become a popular choice for solving complex problems. By integrating the Artificial Immune clonal selection algorithm (CSA) and the particle swarm optimisation (PSO) algorithm, a novel hybrid Clonal Selection Classification and Rule Mining with Swarm Learning Algorithm (CS2) is proposed. The main goal of the approach is to exploit and explore the parallel computation merit of Clonal Selection and the speed and self-organisation merits of Particle Swarm by sharing information between the clonal selection population and the particle swarm. Hence, we employed the advantages of PSO to improve the mutation mechanism of the artificial immune CSA and to mine classification rules within datasets. Consequently, our proposed algorithm required less training time and fewer memory cells in comparison to other AIS algorithms. In this paper, classification rule mining has been modelled as a multiobjective optimisation problem with predictive accuracy. The multiobjective approach is intended to allow the PSO algorithm to return an approximation to the accuracy and comprehensibility border, containing solutions that are spread across the border. We compared the classification accuracy of our proposed algorithm CS2 with five commonly used CSAs, namely: AIRS1, AIRS2, AIRS-Parallel, CLONALG, and CSCA using eight benchmark datasets. We also compared the classification accuracy of CS2 with five other methods, namely: Naive Bayes, SVM, MLP, CART, and RBF. The results show that the proposed algorithm is comparable to the 10 studied algorithms.
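The generalized Euclidean distance with variable weights used in the clustering abstract above can be sketched as follows (an illustrative reconstruction; the weight vector `w` and the example data are assumptions, not taken from the paper):

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Generalized Euclidean distance with per-criterion weights w;
    changing the coefficients in w changes which alternatives end up
    close together, and hence how clusters form."""
    x, y, w = map(np.asarray, (x, y, w))
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))

# Two alternatives rated on two criteria (e.g. price, computing power).
a, b = [1.0, 5.0], [2.0, 1.0]
print(weighted_euclidean(a, b, [1.0, 1.0]))  # plain Euclidean distance
print(weighted_euclidean(a, b, [1.0, 0.0]))  # 2nd criterion ignored -> 1.0
```

Setting a weight to zero removes that criterion from the comparison entirely, which is the mechanism by which varying the coefficients yields different cluster formations.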
As a result, the hybridisation of CSA and PSO allows each algorithm to develop its respective merits, compensate for the other's defects, and improve both the quality and speed of the search. 2013 * 433(<-670): Multiobjective genetic optimization of diagnostic classifiers with implications for generating receiver operating characteristic curves It is well understood that binary classifiers have two implicit objective functions (sensitivity and specificity) describing their performance. Traditional methods of classifier training attempt to combine these two objective functions (or two analogous class performance measures) into one so that conventional scalar optimization techniques can be utilized. This involves incorporating a priori information into the aggregation method so that the resulting performance of the classifier is satisfactory for the task at hand. We have investigated the use of a niched Pareto multiobjective genetic algorithm (GA) for classifier optimization. With niched Pareto GAs, an objective vector is optimized instead of a scalar function, eliminating the need to aggregate classification objective functions. The niched Pareto GA returns a set of optimal solutions that are equivalent in the absence of any information regarding the preferences of the objectives. The a priori knowledge that was used for aggregating the objective functions in conventional classifier training can instead be applied post-optimization to select one of the series of solutions returned from the multiobjective genetic optimization. We have applied this technique to train a linear classifier and an artificial neural network (ANN), using simulated datasets. The performances of the solutions returned from the multiobjective genetic optimization represent a series of optimal (sensitivity, specificity) pairs, which can be thought of as operating points on a receiver operating characteristic (ROC) curve.
All possible ROC curves for a given dataset and classifier are less than or equal to the ROC curve generated by the niched Pareto genetic optimization. 1999 * 438(<-226): Algorithm for Increasing the Speed of Evolutionary Optimization and its Accuracy in Multi-objective Problems Optimization algorithms are important tools for the solution of combinatorial management problems. Nowadays, many of those problems are addressed by using evolutionary algorithms (EAs) that move toward a near-optimal solution by repetitive simulations. Sometimes, such extensive simulations are not possible or are costly and time-consuming. Thus, in this study a method based on artificial neural networks (ANN) is proposed to reduce the number of simulations required in EAs. Specifically, an ANN simulator is used to reduce the number of simulations by the main simulator. The ANN is trained and updated only for required areas in the decision space. Performance of the proposed method is examined by integrating it with the non-dominated sorting genetic algorithm (NSGAII) in multi-objective problems. In terms of density and optimality of the Pareto front, the hybrid NSGAII-ANN is able to extract the Pareto front with much less simulation time compared to the sole use of the NSGAII algorithm. The proposed NSGAII-ANN methodology was examined using three standard test problems (FON, KUR, and ZDT1) and one real-world problem. The latter addresses the operation of a reservoir with two objectives (meeting demand and flood control). Thus, based on this study, use of the NSGAII-ANN integrative algorithm in problems with time-consuming simulators reduces the required time for optimization by up to 50 times.
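The central idea of the niched Pareto GA abstract above, that the non-dominated (sensitivity, specificity) pairs form the operating points of an ROC curve, can be illustrated with a minimal dominance filter (a sketch of Pareto filtering in general, not the authors' niched GA; the example classifier scores are invented):

```python
def pareto_front(points):
    """Return the non-dominated (sensitivity, specificity) pairs: a point
    is discarded if some other point is at least as good in both
    objectives and different from it (points assumed distinct)."""
    front = []
    for p in points:
        dominated = any(q[0] >= p[0] and q[1] >= p[1] and q != p
                        for q in points)
        if not dominated:
            front.append(p)
    return sorted(front)

# Candidate classifiers as (sensitivity, specificity) pairs.
classifiers = [(0.9, 0.5), (0.8, 0.7), (0.8, 0.6), (0.6, 0.9), (0.5, 0.5)]
print(pareto_front(classifiers))  # -> [(0.6, 0.9), (0.8, 0.7), (0.9, 0.5)]
```

Sorting the surviving pairs by sensitivity traces out exactly the monotone trade-off staircase that an ROC curve plots, which is why the Pareto set and the ROC operating points coincide.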
2013 * 476(<-647): Software verification of redundancy in neuro-evolutionary robotics Evolutionary methods are now commonly used to automatically generate autonomous controllers for physical robots as well as for virtually embodied organisms. Although it is generally accepted that some amount of redundancy may result from using an evolutionary approach, few studies have focused on empirically testing the actual amount of redundancy that is present in controllers generated using artificial evolutionary systems. Network redundancy in certain application domains, such as defence, space, and safeguarding, is unacceptable as it puts the reliability of the system at great risk. Thus, our aim in this paper is to test and compare the redundancies of artificial neural network (ANN) controllers that are evolved for a quadrupedal robot using four different evolutionary methodologies. Our results showed that the least amount of redundancy was generated using a self-adaptive Pareto evolutionary multi-objective optimization (EMO) algorithm compared to the more commonly used single-objective evolutionary algorithm (EA) and the weighted-sum EMO algorithm. Finally, self-adaptation was found to be highly beneficial in reducing redundancy when compared against a hand-tuned Pareto EMO algorithm. 2003 * 502(<-235): COMBINING EVOLUTION STRATEGY WITH ORDINAL OPTIMIZATION In this paper, we combine evolution strategy (ES) with ordinal optimization (OO), abbreviated as ES+OO, to solve real-time combinatorial stochastic simulation optimization problems with a huge discrete solution space. The first step of ES+OO is to use an artificial neural network (ANN) to construct a surrogate model to roughly evaluate the objective value of a solution. In the second step, we apply ES assisted by the ANN-based surrogate model to the considered problem to obtain a subset of good enough solutions.
In the last step, we use the exact model to evaluate each solution in the good enough subset, and the best one is the final good enough solution. We apply the proposed algorithm to a wafer testing problem, which is formulated as a combinatorial stochastic simulation optimization problem that consists of a huge discrete solution space formed by the vector of threshold values in the testing process. We demonstrate (a) that ES+OO outperforms the combination of genetic algorithm (GA) with OO using extensive simulations in the wafer testing problem, and that its computational efficiency is suitable for real-time application, (b) the merit of using the OO approach in solving the considered problem, and (c) that ES+OO can obtain an approximate Pareto optimal solution of the multi-objective function residing in the considered problem. Above all, we propose a systematic procedure to evaluate the performance of ES+OO by providing a quantitative result. 2013 * 505(<-678): Artificial neural network representations for hierarchical preference structures In this paper, we introduce two artificial neural network formulations that can be used to assess the preference ratings from the pairwise comparison matrices of the Analytic Hierarchy Process. First, we introduce a modified Hopfield network that can determine the vector of preference ratings associated with a positive reciprocal comparison matrix. The dynamics of this network are mathematically equivalent to the power method, a widely used numerical method for computing the principal eigenvectors of square matrices. However, this Hopfield network representation is incapable of generalizing the preference patterns, and consequently is not suitable for approximating the preference ratings if the pairwise comparison judgments are imprecise. Second, we present a feed-forward neural network formulation that does have the ability to accurately approximate the preference ratings.
We use a simulation experiment to verify the robustness of the feed-forward neural network formulation with respect to imprecise pairwise judgments. From the results of this experiment, we conclude that the feed-forward neural network formulation appears to be a powerful tool for analyzing discrete alternative multicriteria decision problems with imprecise or fuzzy ratio-scale preference judgments. Copyright (C) 1996 Elsevier Science Ltd 1996 * 519(<-366): Multi-objective memetic algorithm: comparing artificial neural networks and pattern search filter method approaches In this work, two methodologies for reducing the computation time of expensive multi-objective optimization problems are compared. These methodologies consist of the hybridization of a multi-objective evolutionary algorithm (MOEA) with local search procedures. First, an inverse artificial neural network proposed previously, consisting of mapping the decision variables into the multiple objectives to be optimized in order to generate improved solutions in certain generations of the MOEA, is presented. Second, a new approach based on a pattern search filter method is proposed in order to perform a local search around certain solutions selected previously from the Pareto frontier. The results obtained by applying both methodologies to difficult test problems indicate a good performance of the proposed approaches. 2011 * 527(<-270): Structure optimization of neural network for dynamic system modeling using multi-objective genetic algorithm The problem of constructing an adequate and parsimonious neural network topology for modeling a non-linear dynamic system is studied and investigated. Neural networks have been shown to perform function approximation and to represent dynamic systems. The network structures are usually guessed at or selected in accordance with the designer's prior knowledge. However, the multiplicity of the model parameters makes it troublesome to find an optimum structure.
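The power method, which the abstract above states is mathematically equivalent to the modified Hopfield network's dynamics, can be sketched directly for a pairwise comparison matrix (an illustrative reconstruction; the iteration count and the example matrix are assumptions):

```python
import numpy as np

def preference_ratings(A, iters=100):
    """Power method on a positive reciprocal comparison matrix A:
    repeated multiplication converges to the principal eigenvector,
    which, normalized to sum to 1, gives the AHP preference ratings."""
    w = np.ones(A.shape[0])
    for _ in range(iters):
        w = A @ w
        w = w / w.sum()  # normalize so the ratings sum to 1
    return w

# Pairwise judgments: alternative 1 is 2x alt 2 and 4x alt 3 (consistent).
A = np.array([[1.0,  2.0, 4.0],
              [0.5,  1.0, 2.0],
              [0.25, 0.5, 1.0]])
print(preference_ratings(A).round(3))  # -> [0.571 0.286 0.143]
```

For a perfectly consistent matrix like this one the method converges in a single step; for imprecise judgments it still converges to the principal eigenvector, which is the rating vector the AHP prescribes.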
In this paper, an alternative algorithm based on a multi-objective optimization algorithm is proposed. The developed neural network model should fulfil two criteria or objectives, namely good predictive accuracy and minimum model structure. The results show that the proposed algorithm is able to identify simulated examples correctly, and identifies the adequate model for real process data based on a set of solutions called the Pareto optimal set, from which the best network can be selected. 2012 * 645(<-524): Multi-criteria optimization in nonlinear predictive control The multi-criteria predictive control of nonlinear dynamical systems based on Artificial Neural Networks (ANNs) and genetic algorithms (GAs) is considered. The ANNs are used to determine process models at each operating level; the control action is provided by minimizing a set of control objectives which are functions of the future predicted output and the future control actions, taking into account constraints on the input signal. An aggregative method based on the Non-dominated Sorting Genetic Algorithm (NSGA) is applied to solve the multi-criteria optimization problem. The results obtained with the proposed control scheme are compared in simulation to those obtained with the multi-model control approach. (c) 2007 IMACS. Published by Elsevier B.V. All rights reserved. 2008 * 653(<-664): An artificial neural network approach to multicriteria model selection This paper presents an intelligent decision support system based on neural network technology for multicriteria model selection. The paper categorizes the problem as simple, utility/value, interactive, or outranking type according to six basic features. The classification of the problem is realized through a two-step neural network analysis applying the back-propagation algorithm. The first Artificial Neural Network (ANN) model, which is used for the selection of an appropriate solving method cluster, consists of one hidden layer.
The six input neurons of the model represent the MCDM problem features, while the two output neurons represent the four MCDM categories. The second ANN model is used for the selection of a specific method within the selected cluster. 2001
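The final abstract's scheme of representing four MCDM categories on two output neurons suggests a 2-bit binary code; a minimal decoding sketch follows (the exact coding is an assumption on our part, since the paper does not specify it):

```python
# Hypothetical 2-bit coding of the four MCDM categories on two
# output neurons; the ordering below is an assumption.
CATEGORIES = ["simple", "utility/value", "interactive", "outranking"]

def decode(outputs, threshold=0.5):
    """Map the two output-neuron activations to one of four categories
    by thresholding each activation to a bit."""
    bits = [1 if o > threshold else 0 for o in outputs]
    return CATEGORIES[bits[0] * 2 + bits[1]]

print(decode([0.1, 0.9]))  # -> "utility/value"
print(decode([0.8, 0.7]))  # -> "outranking"
```

Two neurons suffice for four classes precisely because each thresholded activation contributes one bit, so the network needs fewer outputs than a one-neuron-per-category encoding.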