-*- mode: org -*-

Shortlisting the most relevant articles found while examining the WoK dump. (No time to read them all as of now; must select a most interesting subset.)

* MORE interesting ones, perhaps:
* OTHERs that turned up:
* 3(<-596): MOP/GP models for machine learning
Techniques for machine learning have been extensively studied in recent years as effective tools in data mining. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) has been gaining much popularity recently. In pattern classification problems with two class sets, its idea is to find a maximal-margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. This task is performed by solving a quadratic programming problem in the traditional formulation, and can be reduced to solving a linear program in another formulation. However, the idea of maximal-margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian, and in the 1980s linear classifiers using goal programming were developed extensively. This paper presents an overview of how effectively MOP/GP techniques can be applied to machine learning methods such as SVM, and discusses their problems. (c) 2004 Elsevier B.V. All rights reserved.
2005
* 13(<- 90): A hybrid meta-learning architecture for multi-objective optimization of SVM parameters
Support Vector Machines (SVMs) have attracted considerable attention due to their theoretical foundations and good empirical performance compared to other learning algorithms in different applications. However, SVM performance strongly depends on the adequate calibration of its parameters.
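Side note on 3 above: the maximal-margin idea is easiest to see in one dimension. For separable classes, the margin-maximizing threshold is simply the midpoint between the closest pair of opposite-class points (the 1-D "support vectors"). A minimal sketch; the function name and data are mine:

```python
def max_margin_threshold_1d(neg, pos):
    """Maximal-margin 'hyperplane' for separable 1-D data.

    Assumes every value in `neg` lies below every value in `pos`;
    the margin-maximizing threshold is then the midpoint between
    the two closest opposite-class points (the 1-D support vectors).
    """
    lo, hi = max(neg), min(pos)
    if lo >= hi:
        raise ValueError("classes are not linearly separable in 1-D")
    return (lo + hi) / 2.0

# toy example: support vectors are 2.0 and 5.0, so the threshold is 3.5
print(max_margin_threshold_1d([0.5, 1.2, 2.0], [5.0, 6.3, 7.1]))  # -> 3.5
```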
In this work, we propose a hybrid multi-objective architecture which combines meta-learning (ML) with multi-objective particle swarm optimization algorithms for the SVM parameter selection problem. Given an input problem, the proposed architecture uses an ML technique to suggest an initial Pareto front of SVM configurations based on previous similar learning problems; the suggested Pareto front is then refined by a multi-objective optimization algorithm. In this combination, solutions provided by ML are likely located in good regions of the search space. Hence, using a reduced number of successful candidates, the search process converges faster and is less expensive. In the experiments performed, the proposed solution was compared to traditional multi-objective algorithms with random initialization, obtaining Pareto fronts of higher quality on a set of 100 classification problems.
2014
* 32(<-120): A niching genetic programming-based multi-objective algorithm for hybrid data classification
This paper introduces a multi-objective algorithm based on genetic programming to extract classification rules from databases composed of hybrid data, i.e., regular (e.g. numerical, logical, and textual) and non-regular (e.g. geographical) attributes. The algorithm employs a niche technique combined with a population archive in order to identify the rules that are most suitable for classifying items among the classes of a given data set. The algorithm is implemented in such a way that the user can choose the function set that is most adequate for a given application. This feature makes the proposed approach applicable to virtually any kind of data set classification problem. Besides, the classification problem is modeled as a multi-objective one, in which the maximization of accuracy and the minimization of classifier complexity are considered as the objective functions.
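Both 13 and 32 lean on the same Pareto-dominance bookkeeping over classifier objectives. A minimal sketch of extracting the non-dominated set when both objectives are minimized (all names and the toy data are mine):

```python
def pareto_front(points):
    """Return the non-dominated subset of (error, complexity) pairs.

    A point p dominates q when p is no worse in both objectives and
    strictly better in at least one (both objectives minimized here;
    duplicates would knock each other out, fine for a sketch).
    """
    def dominates(p, q):
        return p[0] <= q[0] and p[1] <= q[1] and p != q
    return [q for q in points if not any(dominates(p, q) for p in points)]

# toy rule sets as (classification error, number of rules)
candidates = [(0.10, 12), (0.12, 5), (0.10, 9), (0.25, 3), (0.30, 4)]
print(pareto_front(candidates))  # -> [(0.12, 5), (0.10, 9), (0.25, 3)]
```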
A set of different classification problems, with considerably different data sets and domains, has been considered: wines, patients with hepatitis, incipient faults in power transformers, and the level of development of cities. In this last data set some of the attributes are geographical, expressed as points, lines or polygons. The effectiveness of the algorithm has been compared with three other methods widely employed for classification: Decision Tree (C4.5), Support Vector Machine (SVM) and Radial Basis Function (RBF). Statistical comparisons have been conducted employing one-way ANOVA and Tukey's tests, in order to provide a reliable comparison of the methods. The results show that the proposed algorithm achieved better classification effectiveness in all tested instances, which suggests that it is suitable for a considerable range of classification applications.
2014
* 82(<-330): Integrating multicriteria PROMETHEE II method into a single-layer perceptron for two-class pattern classification
PROMETHEE methods based on the outranking relation theory are extensively used in multicriteria decision aid. A preference index representing the intensity of preference for one pattern over another can be measured by various preference functions; the higher the intensity, the stronger the preference. In contrast to traditional single-layer perceptrons (SLPs) with the sigmoid function, this paper develops a novel PROMETHEE II-based SLP using concepts from the PROMETHEE II method involving pairwise comparisons between patterns. The assignment of a class label to a pattern depends on its net preference index, which the proposed perceptron obtains. Specifically, this study designs a genetic-algorithm-based learning algorithm to determine the relative weights of the respective criteria in order to derive the preference index for any pair of patterns.
Computer simulations involving several real-world data sets reveal the classification performance of the proposed PROMETHEE II-based SLP. The proposed perceptron performs well compared to other well-known fuzzy and non-fuzzy classification methods.
2011
* 164(<-306): Memetic algorithms and memetic computing optimization: A literature review
Memetic computing is a subject in computer science which considers complex structures such as the combination of simple agents and memes, whose evolutionary interactions lead to intelligent complexes capable of problem-solving. The founding cornerstone of this subject is the concept of memetic algorithms, a class of optimization algorithms whose structure is characterized by an evolutionary framework and a list of local search components. This article presents a broad literature review of this subject, focused on optimization problems. Several classes of optimization problems (discrete, continuous, constrained, multi-objective, and characterized by uncertainties) are addressed by indicating the memetic "recipes" proposed in the literature. In addition, the article focuses on implementation aspects, especially the coordination of memes, which is the most important and characterizing aspect of a memetic structure. Finally, some considerations about future trends in the subject are given.
2012
* 244(<-548): Improving generalization of MLPs with sliding mode control and the Levenberg-Marquardt algorithm
A variation of the well-known Levenberg-Marquardt algorithm for training neural networks is proposed in this work. The algorithm presented restricts the norm of the weight vector to a pre-established value and finds the minimum-error solution for that norm value. The norm constraint controls the neural network's degrees of freedom: the more the norm increases, the more flexible the neural model becomes, and therefore the more closely it fits the training set.
A range of solutions for different norm values is generated, and the solution with the best generalization is selected according to the validation-set error. The results show the efficiency of the algorithm in terms of generalization performance.
2007
* 316(<- 52): Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data
In industrial process control there may be multiple performance objectives, depending on salient features of the input-output data. Aiming at this situation, this paper proposes multiple actor-critic structures to obtain the optimal control via input-output data for unknown nonlinear systems. The shunting inhibitory artificial neural network (SIANN) is used to classify the input-output data into one of several categories. Different performance measure functions may be defined for disparate categories. The approximate dynamic programming algorithm, which contains a model module, a critic network, and an action network, is used to establish the optimal control in each category. A recurrent neural network (RNN) model is used to reconstruct the unknown system dynamics from input-output data, and NNs are used to approximate the critic and action networks, respectively. It is proven that the model error and the closed-loop unknown system are uniformly ultimately bounded. Simulation results demonstrate the performance of the proposed optimal control scheme for the unknown nonlinear system.
2015
* 326(<-687): USING GENETIC ALGORITHMS FOR AN ARTIFICIAL NEURAL-NETWORK MODEL INVERSION
Genetic algorithms (GAs) and artificial neural networks (ANNs) are techniques for optimization and learning, respectively, which have both been adopted from nature. Their main advantage over traditional techniques is their relatively better performance when applied to complex relations. GAs and ANNs are both self-learning systems, i.e., they do not require any background knowledge from the creator.
In this paper, we describe the performance of a GA that finds hypothetical physical structures of poly(ethylene terephthalate) (PET) yarns corresponding to a certain combination of mechanical and shrinkage properties. This GA uses a validated ANN that has been trained on the complex relation between the structure and properties of PET. The technique was tested by comparing the optimal points found by the GA with known experimental data under a variety of multi-criteria conditions.
1993
* 433(<-670): Multiobjective genetic optimization of diagnostic classifiers with implications for generating receiver operating characteristic curves
It is well understood that binary classifiers have two implicit objective functions (sensitivity and specificity) describing their performance. Traditional methods of classifier training attempt to combine these two objective functions (or two analogous class performance measures) into one, so that conventional scalar optimization techniques can be utilized. This involves incorporating a priori information into the aggregation method so that the resulting performance of the classifier is satisfactory for the task at hand. We have investigated the use of a niched Pareto multiobjective genetic algorithm (GA) for classifier optimization. With niched Pareto GAs, an objective vector is optimized instead of a scalar function, eliminating the need to aggregate the classification objective functions. The niched Pareto GA returns a set of optimal solutions that are equivalent in the absence of any information regarding the preferences of the objectives. The a priori knowledge that was used for aggregating the objective functions in conventional classifier training can instead be applied post-optimization to select one of the solutions returned by the multiobjective genetic optimization.
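To keep the (sensitivity, specificity) vocabulary of 433 straight: sweeping a decision threshold over a binary classifier's scores yields one such pair per threshold, i.e. the ROC operating points the abstract refers to. A rough sketch (function name and toy data are mine):

```python
def operating_points(scores, labels):
    """(sensitivity, specificity) pair for every decision threshold.

    `scores` are classifier outputs, `labels` are 1 (positive) / 0 (negative);
    a pattern is called positive when its score is >= the threshold.
    """
    pts = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        pts.append((tp / (tp + fn), tn / (tn + fp)))
    return pts

scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
# sensitivity falls as specificity rises: the usual ROC trade-off
print(operating_points(scores, labels))  # -> [(1.0, 0.0), (1.0, 0.5), (0.5, 0.5), (0.5, 1.0)]
```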
We have applied this technique to train a linear classifier and an artificial neural network (ANN) using simulated datasets. The performances of the solutions returned by the multiobjective genetic optimization represent a series of optimal (sensitivity, specificity) pairs, which can be thought of as operating points on a receiver operating characteristic (ROC) curve. All possible ROC curves for a given dataset and classifier are less than or equal to the ROC curve generated by the niched Pareto genetic optimization.
1999
* FETCHED, not yet read:
* 37(<- 64): Surrogate-assisted multi-objective model selection for support vector machines
Classification is one of the best-known tasks in supervised learning, and a vast number of algorithms for pattern classification have been proposed. Among these, support vector machines (SVMs) are one of the most popular approaches, due to the high performance reached by these methods in a wide number of pattern recognition applications. Nevertheless, the effectiveness of SVMs depends highly on their hyper-parameters. Besides the fine-tuning of the hyper-parameters, the way in which the features are scaled, as well as the presence of non-relevant features, can affect their generalization performance. This paper introduces an approach for addressing model selection for support vector machines used in classification tasks. In our formulation, a model can be composed of feature selection and pre-processing methods in addition to the SVM classifier. We formulate the model selection problem as a multi-objective one, aiming to minimize simultaneously two components that are closely related to the error of a model: the bias and variance components, which are estimated in an experimental fashion. A surrogate-assisted evolutionary multi-objective optimization approach is adopted to explore the hyper-parameter space; we adopted this approach because estimating the bias and variance can be computationally expensive.
Therefore, by using surrogate-assisted optimization, we expect to reduce the number of solutions evaluated by the fitness functions, so that the computational cost is also reduced. Experimental results on benchmark datasets widely used in the literature indicate that our proposal obtains highly competitive models with fewer fitness-function evaluations when compared to state-of-the-art model selection methods.
2015
* 42(<-589): Multiobjective optimization of ensembles of multilayer perceptrons for pattern classification
Pattern classification seeks to minimize the error on unknown patterns; however, in many real-world applications, type I (false positive) and type II (false negative) errors have to be dealt with separately, which is a complex problem since an attempt to minimize one of them usually makes the other grow. In fact, one type of error can be more important than the other, and a trade-off that minimizes the most important error type must be reached. Despite the importance of type II errors, most pattern classification methods take into account only the global classification error. In this paper we propose to optimize both error types in classification by means of a multiobjective algorithm in which each error type and the network size is an objective of the fitness function. A modified version of the GProp method (optimization and design of multilayer perceptrons) is used to simultaneously optimize the network size and the type I and type II errors.
2006
* 44(<-425): A multi-model selection framework for unknown and/or evolutive misclassification cost problems
In this paper, we tackle the problem of model selection when misclassification costs are unknown and/or may evolve. Unlike traditional approaches based on scalar optimization, we propose a generic multi-model selection framework based on a multi-objective approach.
The idea is to automatically train a pool of classifiers instead of a single classifier, each classifier in the pool optimizing a particular trade-off between the objectives. Within the context of two-class classification problems, we introduce the "ROC front" concept as an alternative to the ROC curve representation. This strategy is applied to the multi-model selection of SVM classifiers using an evolutionary multi-objective optimization algorithm. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets as well as on a real-world classification problem.
2010
* 54(<-127): SVM classification for imbalanced data sets using a multiobjective optimization framework
Classification of imbalanced data sets, in which negative instances outnumber the positive instances, is a significant challenge. These data sets are commonly encountered in real-life problems. However, the performance of well-known classifiers is limited in such cases. Various solution approaches have been proposed for the class imbalance problem using either data-level or algorithm-level modifications. Support Vector Machines (SVMs), which have a solid theoretical background, also encounter a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L1-norm SVM approach based on a three-objective optimization problem, so as to incorporate into the formulation the error sums for the two classes independently. Motivated by the inherent multi-objective nature of SVMs, the solution approach utilizes a reduction into two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements, at varying degrees of increased computational effort.
2014
* 77(<- 79): Pareto-Path Multitask Multiple Kernel Learning
A traditional and intuitively appealing Multitask Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with a (partially) shared kernel function, which allows information sharing among the tasks. We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a multiobjective optimization problem which considers the concurrent optimization of all task objectives involved in the Multitask Learning (MTL) problem. Motivated by this observation, and arguing that the former approach is heuristic, we propose a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives. We show that solving our framework produces solutions along a path on the aforementioned PF, and that it subsumes the optimization of the average of the objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance when compared with other similar MTL approaches.
2015
* 137(<-275): A two-stage evolutionary algorithm based on sensitivity and accuracy for multi-class problems
The machine learning community has traditionally used correct classification rates or accuracy (C) values to measure classifier performance, and has generally avoided presenting the classification levels of each class in the results, especially for problems with more than two classes. C values alone are insufficient because they cannot capture the myriad of contributing factors that differentiate the performance of two different classifiers. Receiver Operating Characteristic (ROC) analysis is an alternative that addresses these difficulties, but it can only be used for two-class problems.
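The accuracy-versus-per-class-levels point in 137 recurs throughout this list. A minimal sketch of the two measures (names mine): global accuracy C, and minimum sensitivity S, the accuracy of the worst-classified class:

```python
from collections import defaultdict

def accuracy_and_min_sensitivity(y_true, y_pred):
    """Global accuracy C and minimum per-class sensitivity S.

    S is the accuracy of the worst-classified class, so a high C with
    a low S flags a classifier that sacrifices some class entirely.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    c = sum(hits.values()) / len(y_true)
    s = min(hits[k] / totals[k] for k in totals)
    return c, s

# class 'c' is tiny and always misclassified: C still looks decent, S is 0
y_true = ['a', 'a', 'a', 'b', 'b', 'b', 'c']
y_pred = ['a', 'a', 'a', 'b', 'b', 'a', 'a']
print(accuracy_and_min_sensitivity(y_true, y_pred))  # -> (0.7142857142857143, 0.0)
```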
For this reason, this paper proposes a new approach for analysing classifiers based on two measures: C and sensitivity (S), i.e., the minimum of the accuracies obtained for each class. These measures are optimised through a two-stage evolutionary process, conducted by applying two sequential fitness functions: entropy (E) for the first stage and a new fitness function, area (A), for the second stage. Using these fitness functions, the C level is optimised in the first stage, and the S value of the classifier is generally improved in the second stage without significantly reducing C. This two-stage approach improved S values on the generalisation set (whereas an evolutionary algorithm (EA) based only on the S measure obtains worse S levels) and obtained both high C values and good classification levels for each class. The methodology was applied to solve 16 benchmark classification problems and two complex real-world problems in analytical chemistry and predictive microbiology. It obtained promising results when compared to other competitive multiclass classification algorithms and to a multi-objective alternative based on E and S.
2012
* 211(<-324): Mobility Timing for Agent Communities, a Cue for Advanced Connectionist Systems
We introduce a wait-and-chase scheme that models the contact times between moving agents within a connectionist construct. The idea that elementary processors move within a network to get to a proper position is borne out both by biological neurons during brain morphogenesis and by agents within social networks. From the former we take inspiration to devise a medium-term project for new artificial neural network training procedures, where mobile neurons exchange data only when they are close to one another in a proper space (i.e., are in contact). From the latter we accumulate experience with mobility tracks.
We focus on the preliminary step of characterizing the elapsed time between neuron contacts, which results from a spatial process fitting in the family of random processes with memory, where chasing neurons are stochastically driven by the goal of hitting target neurons. Thus, we add an unprecedented mobility model to the literature in the field, introducing a distribution law for the intercontact times that merges features of both the negative exponential and the Pareto distribution laws. We give a constructive description and implementation of our model, as well as a short analytical form whose parameters are suitably estimated in terms of confidence intervals from experimental data. Numerical experiments show the model and the related inference tools to be sufficiently robust to cope with two main requisites for its exploitation in a neural network: the non-independence of the observed intercontact times and the feasibility of the model inversion problem to infer suitable mobility parameters.
2011
* 241(<-405): Neural network ensembles: immune-inspired approaches to the diversity of components
This work applies two immune-inspired algorithms, namely opt-aiNet and omni-aiNet, to train multi-layer perceptrons (MLPs) to be used in the construction of ensembles of classifiers. The main goal is to investigate the influence of the diversity of the set of solutions generated by each of these algorithms, and whether these solutions lead to improvements in performance when combined in ensembles. omni-aiNet is a multi-objective optimization algorithm and thus explicitly maximizes the components' diversity at the same time as it minimizes their output errors. The opt-aiNet algorithm, by contrast, was originally designed to solve single-objective optimization problems, focusing on the minimization of the output error of the classifiers. However, an implicit diversity maintenance mechanism stimulates the generation of MLPs with different weights, which may result in diverse classifiers.
The performances of opt-aiNet and omni-aiNet are compared with each other and with that of a second-order gradient-based algorithm named MSCG. The results obtained show how the different diversity maintenance mechanisms presented by each algorithm influence the gain in performance obtained with the use of ensembles.
2010
* 247(<-645): Training neural networks with a multi-objective sliding mode control algorithm
This paper presents a new sliding mode control algorithm that is able to guide the trajectory of a multi-layer perceptron within the plane formed by the two objective functions: training-set error and norm of the weight vectors. The results show that the neural networks obtained are able to generate an approximation of the Pareto set, from which a model with improved generalization performance is selected.
2003
* 251(<- 85): Time series forecasting by neural networks: A knee point-based multiobjective evolutionary algorithm approach
In this paper, we investigate the problem of time series forecasting using single hidden layer feedforward neural networks (SLFNs), optimized via multiobjective evolutionary algorithms. By utilizing adaptive differential evolution (JADE) and the knee-point strategy, a nondominated sorting adaptive differential evolution (NSJADE) and its improved version, knee point-based NSJADE (KP-NSJADE), are developed for optimizing SLFNs. JADE, which aims at refining the search area, is introduced into nondominated sorting genetic algorithm II (NSGA-II). The presented NSJADE shows superiority on multimodal problems when compared with NSGA-II. NSJADE is then applied to train SLFNs for time series forecasting. It is revealed that individuals with better forecasting performance in the whole population gather around the knee point. Therefore, KP-NSJADE is proposed to explore the neighborhood of the knee point in the objective space.
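Mental note on the knee point used by KP-NSJADE in 251: for a 2-D front it is commonly taken as the point farthest from the straight line joining the two extreme solutions. A rough sketch under that common definition (not necessarily the paper's exact one; names and data are mine):

```python
def knee_point(front):
    """Point of a 2-D Pareto front farthest from the extreme-to-extreme line.

    `front` is a list of (f1, f2) points; the extremes are the points with
    minimal f1 and minimal f2 (both objectives minimized).
    """
    a = min(front)                       # extreme with minimal f1
    b = min(front, key=lambda p: p[1])   # extreme with minimal f2
    ax, ay = a
    bx, by = b
    def dist(p):
        # unnormalized distance from p to the line through a and b
        # (the constant |ab| factor does not change the argmax)
        return abs((by - ay) * p[0] - (bx - ax) * p[1] + bx * ay - by * ax)
    return max(front, key=dist)

# a front with a sharp bend at (2, 2)
front = [(1.0, 9.0), (2.0, 2.0), (9.0, 1.0)]
print(knee_point(front))  # -> (2.0, 2.0)
```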
Simulation results on eight popular time series databases illustrate the effectiveness of the proposed algorithm in comparison with several popular algorithms.
2014
* 252(<-124): An analysis of accuracy-diversity trade-off for hybrid combined system with multiobjective predictor selection
This study examines the contribution of diversity, under a multi-objective context, to the promotion of learners in an evolutionary system that generates combinations of partially trained learners. The examined system uses grammar-driven genetic programming to evolve hierarchical, multi-component combinations of multilayer perceptrons and support vector machines for regression. Two advances are studied. First, a ranking formula is developed for the selection probability of the base learners. This formula incorporates both a diversity measure and the performance of the learners, and it is tried on a series of artificial and real-world problems. The results show that when the diversity of a learner is incorporated with a weight equal to the learner's performance in the evolutionary selection process, the system is able to provide statistically significantly better generalization. The second advance examined is a substitution phase for learners that are over-dominated under a multi-objective Pareto domination assessment scheme. The results here show that the substitution does not significantly improve system performance; thus the exclusion of very weak learners is not a compelling task for the examined framework.
2014
* 263(<-204): Memetic multiobjective particle swarm optimization-based radial basis function network for classification problems
This paper presents a new multiobjective evolutionary algorithm applied to radial basis function (RBF) network design, based on multiobjective particle swarm optimization augmented with local search features.
The algorithm is named the memetic multiobjective particle swarm optimization RBF network (MPSON) because it integrates the accuracy and structure of an RBF network. The proposed algorithm is applied to two-class and multiclass pattern classification problems, including one complex real-world problem. The experimental results indicate that the proposed algorithm is viable and provides an effective means to design multiobjective RBF networks with good generalization capability and compact network structure. The accuracy and complexity of the networks obtained by the proposed algorithm are compared with those of the memetic non-dominated sorting genetic algorithm-based RBF network (MGAN) through statistical tests. This study shows that MPSON generates RBF networks with an appropriate balance between accuracy and simplicity, outperforming the other algorithms considered.
2013
* 265(<-325): Memetic Elitist Pareto Differential Evolution algorithm based Radial Basis Function Networks for classification problems
This paper presents a new multi-objective evolutionary hybrid algorithm for the design of Radial Basis Function Networks (RBFNs) for classification problems. The algorithm, MEPDEN, is a Memetic Elitist Pareto evolutionary approach based on the Non-dominated Sorting Differential Evolution (NSDE) multiobjective evolutionary algorithm, adapted to design RBFNs and augmented with a local search that uses the Back-propagation algorithm. MEPDEN is tested on two-class and multiclass pattern classification problems. The results obtained in terms of Mean Square Error (MSE), number of hidden nodes, accuracy (ACC), sensitivity (SEN), specificity (SPE) and Area Under the receiver operating characteristics Curve (AUC) show that the proposed approach is able to produce higher prediction accuracies with much simpler network structures.
The accuracy and complexity of the networks obtained by the proposed algorithm are compared with those of the Memetic Elitist Pareto Non-dominated Sorting Genetic Algorithm-based RBFN (MEPGAN) through statistical tests. This study showed that MEPDEN obtains RBFNs with an appropriate balance between accuracy and simplicity, outperforming the other method considered.
2011
* 266(<-378): Memetic Pareto Evolutionary Artificial Neural Networks to determine growth/no-growth in predictive microbiology
The main objective of this work is to automatically design neural network models with sigmoid basis units for binary classification tasks. The classifiers obtained achieve a double objective: a high classification level on the dataset and a high classification level for each class. We present MPENSGA2, a Memetic Pareto Evolutionary approach based on the NSGA2 multiobjective evolutionary algorithm, adapted to design Artificial Neural Network models and augmented with a local search that uses the improved Resilient Backpropagation with backtracking (IRprop+) algorithm. To analyze the robustness of this methodology, it was applied to four complex classification problems in predictive microbiology, describing the growth/no-growth interface of food-borne microorganisms such as Listeria monocytogenes, Escherichia coli R31, Staphylococcus aureus and Shigella flexneri. The results obtained in Correct Classification Rate (CCR), Sensitivity (S) as the minimum of the sensitivities for each class, Area Under the receiver operating characteristic Curve (AUC), and Root Mean Squared Error (RMSE) show that the generalization ability and the classification rate in each class can be improved more efficiently within a multiobjective framework than within a single-objective framework.
2011
* 268(<-414): Sensitivity Versus Accuracy in Multiclass Problems Using Memetic Pareto Evolutionary Neural Networks
This paper proposes a multiclassification algorithm using multilayer perceptron neural network models. It tries to boost two conflicting main objectives of multiclassifiers: a high correct classification rate and a high classification rate for each class. This last objective is not usually optimized in classification, but is considered here given the need to obtain high precision in each class in real problems. To solve this machine learning problem, we use a Pareto-based multiobjective optimization methodology based on a memetic evolutionary algorithm: a memetic Pareto evolutionary approach based on the NSGA2 evolutionary algorithm (MPENSGA2). Once the Pareto front is built, two strategies for automatic individual selection are used: the best model in accuracy and the best model in sensitivity (the extremes of the Pareto front). These methodologies are applied to solve 17 classification benchmark problems obtained from the University of California at Irvine (UCI) repository and one complex real classification problem. The models obtained show high accuracy and a high classification rate for each class.
2010
* 274(<-113): Metrics to guide a multi-objective evolutionary algorithm for ordinal classification
Ordinal classification, or ordinal regression, is a classification problem in which the labels have an ordered arrangement between them. Due to this order, alternative performance evaluation metrics need to be used in order to consider the magnitude of the errors. This paper presents a study of the use of a multi-objective optimization approach in the context of ordinal classification. We contribute a study of ordinal classification performance metrics, and propose a new performance metric, the maximum mean absolute error (MMAE).
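As I read 274, MMAE is the worst per-class mean absolute error of the integer-coded ordinal labels, in contrast to the global MAE. A quick sketch under that reading (names and data are mine, not the paper's):

```python
from collections import defaultdict

def mae_and_mmae(y_true, y_pred):
    """Global MAE and maximum per-class MAE for integer ordinal labels.

    MMAE is read here as the MAE of the worst-predicted class, so it
    reflects both the per-class distribution and the error magnitude.
    """
    errs = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errs[t].append(abs(t - p))
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    mmae = max(sum(v) / len(v) for v in errs.values())
    return mae, mmae

# class 3 is rare and badly predicted: MAE stays small, MMAE exposes it
y_true = [1, 1, 1, 2, 2, 2, 3]
y_pred = [1, 1, 1, 2, 2, 2, 1]
print(mae_and_mmae(y_true, y_pred))  # -> (0.2857142857142857, 2.0)
```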
MMAE considers the per-class distribution of patterns and the magnitude of the errors, both issues being crucial for ordinal regression problems. In addition, we empirically show that some of the performance metrics are competitive objectives, which justifies the use of multi-objective optimization strategies. In our case, a multi-objective evolutionary algorithm optimizes an artificial neural network ordinal model with different pairs of metric combinations, and we conclude that the pair of the mean absolute error (MAE) and the proposed MMAE is the most favourable. A study of the relationship between the metrics of this proposal is performed, and the graphical representation in the two-dimensional space where the search of the evolutionary algorithm takes place is analysed. The results obtained show a good classification performance, opening new lines of research in the evaluation and model selection of ordinal classifiers. (C) 2014 Elsevier B.V. All rights reserved. 2014 * 276(<-334): Weighting Efficient Accuracy and Minimum Sensitivity for Evolving Multi-Class Classifiers Recently, a multi-objective Sensitivity-Accuracy based methodology has been proposed for building classifiers for multi-class problems. This technique is especially suitable for imbalanced and multi-class datasets. Moreover, the high computational cost of multi-objective approaches is well known, so more efficient alternatives must be explored. This paper presents an efficient alternative to the Pareto-based solution when considering both Minimum Sensitivity and Accuracy in multi-class classifiers. Alternatives are implemented by extending the Evolutionary Extreme Learning Machine algorithm for training artificial neural networks. Experiments were performed to select the best option after considering alternative proposals and related methods. Based on the experiments, this methodology is competitive in Accuracy, Minimum Sensitivity and efficiency.
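The MMAE proposed in entry 274 is, going by the abstract, the mean absolute error of the worst-classified class. A minimal sketch of MAE versus MMAE as I understand them from the abstract (function names are mine), assuming labels encoded as consecutive integers respecting the ordinal scale:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error over all patterns."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def mmae(y_true, y_pred):
    """Maximum over the classes of the per-class mean absolute error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return max(np.mean(np.abs(y_pred[y_true == c] - c))
               for c in np.unique(y_true))
```

On ordinal labels, a class whose patterns are all misplaced by two categories drives MMAE to 2 even when the global MAE stays low, which is the per-class view the authors argue is crucial.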
2011 * 288(<-424): A multi-objective memetic and hybrid methodology for optimizing the parameters and performance of artificial neural networks The use of artificial neural networks implies considerable time spent choosing a set of parameters that contribute toward improving the final performance. Initial weights, the number of hidden nodes and layers, training algorithm rates and transfer functions are normally selected through a manual process of trial-and-error that often fails to find the best possible set of neural network parameters for a specific problem. This paper proposes an automatic search methodology for the optimization of the parameters and performance of neural networks relying on the use of Evolution Strategies, Particle Swarm Optimization and concepts from Genetic Algorithms corresponding to the hybrid and global search module. There is also a module that refers to local searches, including the well-known Multilayer Perceptrons, Back-propagation and the Levenberg-Marquardt training algorithms. The methodology proposed here performs the search using the aforementioned parameters in an attempt to optimize the networks and performance. Experiments were performed and the results proved the proposed method to be better than trial-and-error and other methods found in the literature. Crown Copyright (C) 2009 Published by Elsevier B.V. All rights reserved. 2010 * 293(<-170): MULTI-OBJECTIVE OPTIMIZATION BY MEANS OF MULTI-DIMENSIONAL MLP NEURAL NETWORKS In this paper, a multi-layer perceptron (MLP) neural network (NN) is put forward as an efficient tool for performing two tasks: 1) optimization of multi-objective problems and 2) solving a non-linear system of equations. In both cases, mathematical functions which are continuous and partially bounded are involved. Previously, these two tasks were performed by recurrent neural networks and also strong algorithms like evolutionary ones.
In this study, a multi-dimensional structure in the output layer of the MLP-NN, as an innovative method, is utilized to implicitly optimize the multivariate functions under the network energy optimization mechanism. To this end, the activation functions in the output layer are replaced with the multivariate functions intended to be optimized. The effective training parameters in the global search are surveyed. Also, it is demonstrated that the MLP-NN with a proper dynamic learning rate is able to find globally optimal solutions. Finally, the efficiency of the MLP-NN in both aspects of speed and power is investigated by some well-known experimental examples. In some of these examples, the proposed method gives explicitly better globally optimal solutions compared to those of the other references and also shows completely satisfactory results in other experiments. 2014 * 329(<-271): Convergence analysis of sliding mode trajectories in multi-objective neural networks learning The Pareto-optimality concept is used in this paper in order to represent a constrained set of solutions that are able to trade-off the two main objective functions involved in neural networks supervised learning: data-set error and network complexity. The neural network is described as a dynamic system having error and complexity as its state variables and learning is presented as a process of controlling a learning trajectory in the resulting state space. In order to control the trajectories, sliding mode dynamics is imposed on the network. It is shown that arbitrary learning trajectories can be achieved by maintaining the sliding mode gains within their convergence intervals. Formal proofs of convergence conditions are therefore presented.
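Entry 329 (like 527 below) selects a final network from a Pareto set trading off data-set error against network complexity. A generic sketch (mine, not from either paper) of extracting the non-dominated set from candidate (error, complexity) pairs, with both objectives minimized:

```python
def pareto_front(solutions):
    """Keep the (error, complexity) pairs not dominated by any other pair.

    A pair dominates another if it is no worse in both objectives and
    differs in at least one of them.
    """
    return [s for s in solutions
            if not any(o[0] <= s[0] and o[1] <= s[1] and o != s
                       for o in solutions)]
```

For example, pareto_front([(0.1, 10), (0.2, 5), (0.3, 8), (0.15, 12)]) keeps only (0.1, 10) and (0.2, 5); the other two pairs are each beaten on both objectives by some front member.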
The concept of trajectory learning presented in this paper goes beyond the selection of a final state in the Pareto set, since it can be reached through different trajectories, and states in the trajectory can be assessed individually against an additional objective function. (c) 2012 Elsevier Ltd. All rights reserved. 2012 * 347(<-362): Learning in the feed-forward random neural network: A critical review The Random Neural Network (RNN) has received, since its inception in 1989, considerable attention and has been successfully used in a number of applications. In this critical review paper we focus on the feed-forward RNN model and its ability to solve classification problems. In particular, we paid special attention to the RNN literature related to learning algorithms that discover the RNN interconnection weights, suggested other potential algorithms that can be used to find the RNN interconnection weights, and compared the RNN model with other neural-network based and non-neural network based classifier models. In review, the extensive literature review and experimentation with the RNN feed-forward model provided us with the necessary guidance to introduce six critical review comments that identify some gaps in the RNN's related literature and suggest directions for future research. (C) 2010 Elsevier B.V. All rights reserved. 2011 * 353(<-194): Neural Networks Applied in Chemistry. II. Neuro-Evolutionary Techniques in Process Modeling and Optimization Artificial neural networks are widely used in data analysis and to control dynamic processes. These tools are powerful and versatile, but the way in which they are constructed, in particular their architecture, strongly affects their value and reliability. We review here some key techniques for optimizing artificial neural networks and comment on their use in process modeling and optimization.
Neuro-evolutionary techniques are described and compared, with the goal of providing efficient modeling methodologies which employ an optimal neural model. We also discuss how neural networks and evolutionary algorithms can be combined. Applications from chemical engineering illustrate the effectiveness and reliability of the hybrid neuro-evolutionary methods. 2013 * 369(<-499): Hybrid multiobjective evolutionary design for artificial neural networks Evolutionary algorithms are a class of stochastic search methods that attempt to emulate the biological process of evolution, incorporating concepts of selection, reproduction, and mutation. In recent years, there has been an increase in the use of evolutionary approaches in the training of artificial neural networks (ANNs). While evolutionary techniques for neural networks have been shown to provide superior performance over conventional training approaches, the simultaneous optimization of network performance and architecture will almost always result in a slow training process due to the added algorithmic complexity. In this paper, we present a geometrical measure based on the singular value decomposition (SVD) to estimate the necessary number of neurons to be used in training a single-hidden-layer feedforward neural network (SLFN). In addition, we develop a new hybrid multiobjective evolutionary approach that includes the features of a variable length representation that allows for easy adaptation of neural network structures, an architectural recombination procedure based on the geometrical measure that adapts the number of necessary hidden neurons and facilitates the exchange of neuronal information between candidate designs, and a microhybrid genetic algorithm (muHGA) with an adaptive local search intensity scheme for local fine-tuning.
In addition, the performances of well-known algorithms as well as the effectiveness and contributions of the proposed approach are analyzed and validated through a variety of data set types. 2008 * 379(<-274): Computational algorithms inspired by biological processes and evolution In recent times, computational algorithms inspired by biological processes and evolution are gaining much popularity for solving science and engineering problems. These algorithms are broadly classified into evolutionary computation and swarm intelligence algorithms, which are derived based on the analogy of natural evolution and biological activities. These include genetic algorithms, genetic programming, differential evolution, particle swarm optimization, ant colony optimization, artificial neural networks, etc. The algorithms, being random-search techniques, use some heuristics to guide the search towards the optimal solution and speed up the convergence to obtain the global optimal solutions. The bio-inspired methods have several attractive features and advantages compared to conventional optimization solvers. They also facilitate the advantage of simulation and optimization environment simultaneously to solve hard-to-define (in simple expressions), real-world problems. These biologically inspired methods have provided novel ways of problem-solving for practical problems in traffic routing, networking, games, industry, robotics, economics, mechanical, chemical, electrical, civil, water resources and other fields. This article discusses the key features and development of bio-inspired computational algorithms, and their scope for application in science and engineering fields. 2012 * 527(<-270): Structure optimization of neural network for dynamic system modeling using multi-objective genetic algorithm The problem of constructing an adequate and parsimonious neural network topology for modeling non-linear dynamic systems is studied and investigated.
Neural networks have been shown to perform function approximation and represent dynamic systems. The network structures are usually guessed or selected in accordance with the designer's prior knowledge. However, the multiplicity of the model parameters makes it troublesome to get an optimum structure. In this paper, an alternative algorithm based on a multi-objective optimization algorithm is proposed. The developed neural network model should fulfil two criteria or objectives, namely good predictive accuracy and minimum model structure. The result shows that the proposed algorithm is able to identify simulated examples correctly, and identifies the adequate model for real process data based on a set of solutions called the Pareto optimal set, from which the best network can be selected. 2012 * 535(<-290): A Compact Optical Instrument with Artificial Neural Network for pH Determination The aim of this work was the determination of pH with a sensor array-based optical portable instrument. This sensor array consists of eleven membranes with selective colour changes at different pH intervals. The method for the pH calculation is based on the implementation of artificial neural networks that use the responses of the membranes to generate a final pH value. A multi-objective algorithm was used to select the minimum number of sensing elements required to achieve an accurate pH determination from the neural network, and also to minimise the network size. This helps to minimise instrument and array development costs and save on microprocessor energy consumption. A set of artificial neural networks that fulfils these requirements is proposed using different combinations of the membranes in the sensor array, and is evaluated in terms of accuracy and reliability. In the end, the network including the response of the eleven membranes in the sensor was selected for validation in the instrument prototype because of its high accuracy.
The performance of the instrument was evaluated by measuring the pH of a large set of real samples, showing that high precision can be obtained in the full range. 2012 * 587(<- 21): ANN-based interval forecasting of streamflow discharges using the LUBE method and MOFIPS The estimation of prediction intervals (PIs) is a major issue limiting the use of Artificial Neural Networks (ANN) solutions for operational streamflow forecasting. Recently, a Lower Upper Bound Estimation (LUBE) method has been proposed that outperforms traditional techniques for ANN-based PI estimation. This method constructs ANNs with two output neurons that directly approximate the lower and upper bounds of the PIs. The training is performed by minimizing a coverage width-based criterion (CWC), which is a compound, highly nonlinear and discontinuous function. In this work, we test the suitability of the LUBE approach in producing PIs at different confidence levels (CL) for the 6 h ahead streamflow discharges of the Susquehanna and Nehalem Rivers, US. Due to the success of Particle Swarm Optimization (PSO) in LUBE applications, variants of this algorithm have been employed for CWC minimization. The results obtained are found to vary substantially depending on the chosen PSO paradigm. While the returned PIs are poor when single-objective swarm optimization is employed, substantial improvements are recorded when a multi-objective framework is considered for ANN development. In particular, the Multi-Objective Fully Informed Particle Swarm (MOFIPS) optimization algorithm is found to return valid PIs for both rivers and for the three CLs considered (90%, 95% and 99%). With average PI widths ranging from a minimum of 7% to a maximum of 15% of the range of the streamflow data in the test datasets, MOFIPS-based LUBE represents a viable option for straightforward design of more reliable interval-based streamflow forecasting models. (C) 2015 Elsevier Ltd. All rights reserved.
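The LUBE entry (587) trains an ANN whose two outputs are the PI bounds, scored by the coverage width-based criterion. I have not reproduced the CWC formula here; below is a minimal sketch (mine, not from the paper) of its two standard ingredients, PI coverage probability and PI normalized average width:

```python
import numpy as np

def picp(y, lower, upper):
    """PI coverage probability: fraction of observations inside the interval."""
    return float(np.mean((y >= lower) & (y <= upper)))

def pinaw(y, lower, upper):
    """PI normalized average width: mean interval width over the data range."""
    return float(np.mean(upper - lower) / (y.max() - y.min()))
```

A CWC-style criterion then penalizes the width term whenever PICP drops below the nominal confidence level, which is what makes it a compound, discontinuous objective and motivates the swarm-based minimization used in the paper.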
2015 * DONE READING: * 48(<- 26): Joint model for feature selection and parameter optimization coupled with classifier ensemble in chemical mention recognition (excluded from dissertation(?), because not MLP; good readings otherwise...) Mention recognition in chemical texts plays an important role in a wide range of application areas. Feature selection and parameter optimization are the two important issues in machine learning. While the former improves the quality of a classifier by removing the redundant and irrelevant features, the latter concerns finding the most suitable parameter values, which have significant impact on the overall classification performance. In this paper we formulate a joint model that performs feature selection and parameter optimization simultaneously, and propose two approaches based on the concepts of single and multiobjective optimization techniques. Classifier ensemble techniques are also employed to improve the performance further. We identify and implement a variety of features that are mostly domain-independent. Experiments are performed with various configurations on the benchmark patent and Medline datasets. Evaluation shows encouraging performance in all the settings. (C) 2015 Elsevier B.V. All rights reserved. 2015 * 69(<-369): Classification as Clustering: A Pareto Cooperative-Competitive GP Approach (might be included or excluded; GP and discriminative functions instead of MLP) Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function.
This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases. 2011 * 246(<-560): Many-objective training of a multi-layer perceptron (excluded on grounds of bad quality and non-trustworthy reporting) In this paper, a many-objective training scheme for a multi-layer feed-forward neural network is studied. In this scheme, each training data set, or the average over sub-sets of the training data, provides a single objective. A recently proposed group of evolutionary many-objective optimization algorithms based on the NSGA-II algorithm have been examined with respect to the handling of such problem cases. A modified NSGA-II algorithm, using the norm of an individual as a secondary ranking assignment method, appeared to give the best results, even for a large number of objectives (up to 50 in this study). However, there was no notable increase in performance against the standard backpropagation algorithm, and a remarkable drop in performance for higher-dimensional feature spaces (dimension 30 in this study).
2007 * 319(<-243): Memetic Pareto differential evolutionary neural network used to solve an unbalanced liver transplantation problem Donor-recipient matching constitutes a complex scenario difficult to model. The risk of subjectivity and the likelihood of falling into error must not be underestimated. Computational tools for the decision-making process in liver transplantation can be useful, despite the inherent complexity involved. Therefore, a multi-objective evolutionary algorithm and various techniques to select individuals from the Pareto front are used in this paper to obtain artificial neural network models to aid decision making. Moreover, a combination of two pre-processing methods has been applied to the dataset to offset the existing imbalance. One of them is a resampling method and the other is an outlier deletion method. The best model obtained with these procedures (with AUC = 0.66) gives medical experts a probability of graft survival at 3 months after the operation. This probability can help medical experts to achieve the best possible decision without forgetting the principles of fairness, efficiency and equity. 2013 * 344(<-404): Multiple criteria optimization-based data mining methods and applications: a systematic survey (excluded; not relevant; too far from MLP) Support Vector Machine, an optimization technique, is well known in the data mining community. In fact, many other optimization techniques have been effectively used in dealing with data separation and analysis. For the last 10 years, the author and his colleagues have proposed and extended a series of optimization-based classification models via Multiple Criteria Linear Programming (MCLP) and Multiple Criteria Quadratic Programming (MCQP). These methods are different from statistics, decision tree induction, and neural networks. The purpose of this paper is to review the basic concepts and frameworks of these methods and promote the research interests in the data mining community.
According to the evolution of multiple criteria programming, the paper starts with the bases of MCLP. Then, it further discusses penalized MCLP, MCQP, Multiple Criteria Fuzzy Linear Programming (MCFLP), Multi-Class Multiple Criteria Programming (MCMCP), and the kernel-based Multiple Criteria Linear Program, as well as MCLP-based regression. This paper also outlines several applications of Multiple Criteria optimization-based data mining methods, such as Credit Card Risk Analysis, Classification of HIV-1 Mediated Neuronal Dendritic and Synaptic Damage, Network Intrusion Detection, Firm Bankruptcy Prediction, and VIP E-Mail Behavior Analysis. 2010