-*- mode: org -*-
Focus strictly on SVM learning
* 3(<-596): MOP/GP models for machine learning
Techniques for machine learning have been extensively studied in recent years as effective tools in data mining. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) is gaining much popularity recently. In pattern classification problems with two class sets, its idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. This task is performed by solving a quadratic programming problem in the traditional formulation, and can be reduced to solving a linear program in another formulation. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper presents an overview of how effectively MOP/GP techniques can be applied to machine learning such as SVM, and discusses their problems. (c) 2004 Elsevier B.V. All rights reserved. 2005
* 4(<-614): Study on Support Vector Machines Using Mathematical Programming
Machine learning has been extensively studied in recent years as an effective tool in pattern classification problems. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) is gaining much popularity recently.
In pattern classification problems with two class sets, the idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper proposes a new family of SVMs using MOP/GP techniques, and discusses its effectiveness through several numerical experiments. 2005
* 6(<-465): A Multiobjective Genetic SVM Approach for Classification Problems With Limited Training Samples
In this paper, a novel method for semisupervised classification with limited training samples is presented. Its aim is to exploit unlabeled data, available at zero cost in the image under analysis, for improving the accuracy of a classification process based on support vector machines (SVMs). It is based on the idea of augmenting the original set of training samples with a set of unlabeled samples after estimating their labels. The label estimation process is performed within a multiobjective genetic optimization framework where each chromosome of the evolving population encodes the label estimates as well as the SVM classifier parameters for tackling the model selection issue. Such a process is guided by the joint minimization of two different criteria which express the generalization capability of the SVM classifier. The two explored criteria are an empirical risk measure and an indicator of the classification model sparseness, respectively. The experimental results obtained on two multisource remote sensing data sets confirm the promising capabilities of the proposed approach, which allows the following: 1) taking a clear advantage in terms of classification accuracy from unlabeled samples used for inflating the original training set and 2) solving automatically the tricky model selection issue.
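The maximal-margin idea in entries 3 and 4 reduces, in its soft-margin form, to minimizing a regularized hinge loss. A minimal sub-gradient sketch in plain Python (the toy data, step size, regularization strength, and epoch count are illustrative assumptions, not taken from either paper):

```python
# Toy linearly separable 2-D data with +1/-1 labels (assumed for illustration).
X = [(2.0, 2.0), (3.0, 2.5), (2.5, 3.0), (-2.0, -2.0), (-3.0, -2.5), (-2.5, -3.0)]
y = [1, 1, 1, -1, -1, -1]

w = [0.0, 0.0]
b = 0.0
lam = 0.01   # regularization strength (assumed value)
eta = 0.1    # step size (assumed value)

for epoch in range(200):
    for (x1, x2), yi in zip(X, y):
        margin = yi * (w[0] * x1 + w[1] * x2 + b)
        # Sub-gradient step on lam/2 * ||w||^2 + max(0, 1 - margin)
        if margin < 1:
            w[0] += eta * (yi * x1 - lam * w[0])
            w[1] += eta * (yi * x2 - lam * w[1])
            b += eta * yi
        else:
            w[0] -= eta * lam * w[0]
            w[1] -= eta * lam * w[1]

preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for x1, x2 in X]
print(preds)  # should match y on this separable toy set
```

The QP formulation the papers mention solves the same trade-off exactly; the sub-gradient loop is only the cheapest way to see the objective at work.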
2009
* 8(<-514): Genetic SVM approach to semisupervised multitemporal classification
The updating of classification maps, as new image acquisitions are obtained, raises the problem of updating the ground-truth information (training samples). In this context, semisupervised multitemporal classification represents an interesting though still not well consolidated approach to tackle this issue. In this letter, we propose a novel methodological solution based on this approach. Its underlying idea is to update the ground-truth information through an automatic estimation process, which exploits archived ground-truth information as well as basic indications from the user about allowed/forbidden class transitions from one acquisition date to another. This updating problem is formulated by means of the support vector machine classification approach and a constrained multiobjective optimization genetic algorithm. Experimental results on a multitemporal data set consisting of two multisensor (Landsat-5 Thematic Mapper and European Remote Sensing satellite synthetic aperture radar) images are reported and discussed. 2008
* 13(<- 90): A hybrid meta-learning architecture for multi-objective optimization of SVM parameters
Support Vector Machines (SVMs) have received considerable attention due to their theoretical foundations and good empirical performance when compared to other learning algorithms in different applications. However, SVM performance strongly depends on the adequate calibration of its parameters. In this work we propose a hybrid multi-objective architecture which combines meta-learning (ML) with multi-objective particle swarm optimization algorithms for the SVM parameter selection problem. Given an input problem, the proposed architecture uses an ML technique to suggest an initial Pareto front of SVM configurations based on previous similar learning problems; the suggested Pareto front is then refined by a multi-objective optimization algorithm.
In this combination, solutions provided by ML are likely to be located in good regions of the search space. Hence, using a reduced number of successful candidates, the search process converges faster and is less expensive. In the performed experiments, the proposed solution was compared to traditional multi-objective algorithms with random initialization, obtaining Pareto fronts with higher quality on a set of 100 classification problems. (C) 2014 Elsevier B.V. All rights reserved. 2014
* 19(<-585): Multiobjective analysis of chaotic dynamic systems with sparse learning machines
Sparse learning machines provide a viable framework for modeling chaotic time-series systems. A powerful state-space reconstruction methodology using both support vector machines (SVM) and relevance vector machines (RVM) within a multiobjective optimization framework is presented in this paper. The utility and practicality of the proposed approaches have been demonstrated on the time series of the Great Salt Lake (GSL) biweekly volumes from 1848 to 2004. A comparison of the two methods is made based on their predictive power and robustness. The reconstruction of the dynamics of the Great Salt Lake volume time series is attained using the most relevant feature subset of the training data. In this paper, efforts are also made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure, model parameters, and bootstrapping samples. The resulting model will normally have a structure, including parameterization, that suits the information content of the available data, and can be used to develop time series forecasts for multiple lead times ranging from two weeks to several months. (c) 2005 Elsevier Ltd. All rights reserved.
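Entry 13 maintains a Pareto front of SVM configurations rather than a single best one. The core operation is non-dominated filtering, sketched here in plain Python over hypothetical (C, gamma) configurations with two assumed objectives, validation error and model complexity (number of support vectors):

```python
# Hypothetical SVM configurations; "error" and "n_sv" are assumed objective values.
configs = [
    {"C": 1.0,   "gamma": 0.1,  "error": 0.10, "n_sv": 40},
    {"C": 10.0,  "gamma": 0.1,  "error": 0.08, "n_sv": 55},
    {"C": 10.0,  "gamma": 1.0,  "error": 0.12, "n_sv": 45},  # dominated by the first
    {"C": 100.0, "gamma": 0.5,  "error": 0.05, "n_sv": 90},
    {"C": 0.1,   "gamma": 0.01, "error": 0.20, "n_sv": 25},
]

def dominates(a, b):
    # a dominates b if a is no worse on both objectives and strictly better on one.
    no_worse = a["error"] <= b["error"] and a["n_sv"] <= b["n_sv"]
    strictly = a["error"] < b["error"] or a["n_sv"] < b["n_sv"]
    return no_worse and strictly

front = [c for c in configs
         if not any(dominates(other, c) for other in configs)]
print([(c["error"], c["n_sv"]) for c in front])
```

In the paper the initial front comes from meta-learning over similar past problems and is then refined by multi-objective PSO; the filter above is only the dominance test those steps share.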
2006
* 20(<-157): Leave-one-out cross-validation-based model selection for multi-input multi-output support vector machine
As an effective approach for multi-input multi-output regression estimation problems, a multi-dimensional support vector regression (SVR), named M-SVR, is generally capable of obtaining better predictions than applying a conventional support vector machine (SVM) independently to each output dimension. However, although there are many generalization error bounds for conventional SVMs, none of them can be directly applied to M-SVR. In this paper, a new leave-one-out (LOO) error estimate for M-SVR is first derived through a virtual LOO cross-validation procedure. This LOO error estimate can be computed directly once a training process has ended, with less computational complexity than the traditional LOO method. Based on this LOO estimate, a new model selection method for M-SVR based on a multi-objective optimization strategy is further proposed in this paper. Experiments on a toy noisy function regression and a practical engineering data set, that is, dynamic load identification on a cylinder vibration system, are both conducted, demonstrating comparable results of the proposed method in terms of generalization performance and computational cost. 2014
* 23(<-609): Multi-objective model selection for support vector machines
In this article, model selection for support vector machines is viewed as a multi-objective optimization problem, where model complexity and training accuracy define two conflicting objectives. Different optimization criteria are evaluated: split modified radius margin bounds, which allow for comparing existing model selection criteria, and the training error in conjunction with the number of support vectors for designing sparse solutions. 2005
* 26(<-150): A novel feature selection method for twin support vector machine
Both the support vector machine (SVM) and the twin support vector machine (TWSVM) are powerful classification tools.
However, in contrast to the many SVM-based feature selection methods, TWSVM has had no corresponding method up to now, due to its different mechanism. In this paper, we propose a feature selection method based on TWSVM, called FTSVM. It is interesting because of the advantages of TWSVM in many cases. Our FTSVM is quite different from the SVM-based feature selection methods. In fact, linear SVM constructs a single separating hyperplane, which corresponds to a single weight for each feature, whereas linear TWSVM constructs two fitting hyperplanes, which correspond to two weights for each feature. In our linear FTSVM, in order to link these two fitting hyperplanes, a feature selection matrix is introduced. Thus, feature selection becomes the problem of finding an optimal matrix, leading to a multi-objective mixed-integer programming problem solved by a greedy algorithm. In addition, the linear FTSVM has been extended to the nonlinear case. Furthermore, a feature ranking strategy based on FTSVM is also suggested. The experimental results on several publicly available benchmark datasets indicate that our FTSVM not only gives nice feature selection in both the linear and nonlinear cases but also improves the performance of TWSVM efficiently. (C) 2014 Elsevier B.V. All rights reserved. 2014
* 30(<- 8): Novel approaches using evolutionary computation for sparse least square support vector machines
This paper introduces two new approaches to building sparse least square support vector machines (LSSVM) based on genetic algorithms (GAs) for classification tasks. LSSVM classifiers are an alternative to SVM ones because the training process of LSSVM classifiers only requires solving a linear equation system instead of a quadratic programming optimization problem. However, the absence of sparseness in the Lagrange multiplier vector (i.e. the solution) is a significant problem for the effective use of these classifiers.
In order to overcome this lack of sparseness, we propose both single- and multi-objective GA approaches to leave a few support vectors out of the solution without affecting the classifier's accuracy, and even improving it. The main idea is to leave out outliers, non-relevant patterns, or those which may be corrupted with noise and thus prevent classifiers from achieving higher accuracies along with a reduced set of support vectors. Differently from previous works, genetic algorithms are used in this work to obtain sparseness, not to find the optimal values of the LSSVM hyper-parameters. (C) 2015 Elsevier B.V. All rights reserved. 2015
* 31(<-586): Additive preference model with piecewise linear components resulting from Dominance-based Rough Set Approximations
Dominance-based Rough Set Approach (DRSA) has been proposed for multi-criteria classification problems in order to handle inconsistencies in the input information with respect to the dominance principle. The end result of DRSA is a decision rule model of Decision Maker preferences. In this paper, we consider an additive function model resulting from dominance-based rough approximations. The presented approach is similar to the UTA and UTADIS methods. However, we define the goal function of the optimization problem in a similar way as is done in Support Vector Machines (SVM). The problem may also be defined as one of searching for linear value functions in a transformed feature space obtained by exhaustive binarization of criteria. 2006
* 37(<- 64): Surrogate-assisted multi-objective model selection for support vector machines
Classification is one of the most well-known tasks in supervised learning. A vast number of algorithms for pattern classification have been proposed so far. Among these, support vector machines (SVMs) are one of the most popular approaches, due to the high performance reached by these methods in a wide number of pattern recognition applications.
Nevertheless, the effectiveness of SVMs highly depends on their hyper-parameters. Besides the fine-tuning of the hyper-parameters, the way in which the features are scaled, as well as the presence of non-relevant features, can affect generalization performance. This paper introduces an approach for addressing model selection for support vector machines used in classification tasks. In our formulation, a model can be composed of feature selection and pre-processing methods besides the SVM classifier. We formulate the model selection problem as a multi-objective one, aiming to minimize simultaneously two components that are closely related to the error of a model: the bias and variance components, which are estimated in an experimental fashion. A surrogate-assisted evolutionary multi-objective optimization approach is adopted to explore the hyper-parameter space. We adopted this approach because estimating the bias and variance can be computationally expensive. Therefore, by using surrogate-assisted optimization, we expect to reduce the number of solutions evaluated by the fitness functions, so that the computational cost is also reduced. Experimental results conducted on benchmark datasets widely used in the literature indicate that highly competitive models, with fewer fitness function evaluations, are obtained by our proposal when compared to state-of-the-art model selection methods. (C) 2014 Elsevier B.V. All rights reserved. 2015
* 41(<-580): Multi-objective parameters selection for SVM classification using NSGA-II
Selecting proper parameters is an important issue in extending the classification ability of the Support Vector Machine (SVM), which makes SVM practically useful. The Genetic Algorithm (GA) has been widely applied to the problem of parameter selection for SVM classification due to its ability to discover good solutions quickly for complex searching and optimization problems.
However, traditional GA approaches in this field rely on a single generalization error bound as the fitness function for selecting parameters. Since several generalization error bounds have been developed, picking and using a single criterion as the fitness function seems intractable and insufficient. Motivated by multi-objective optimization problems, this paper introduces an efficient method of parameter selection for SVM classification based on the multi-objective evolutionary algorithm NSGA-II. We also introduce an adaptive mutation rate for NSGA-II. Experimental results show that our method is better than single-objective approaches, especially in the case of tiny training sets with large testing sets. 2006
* 44(<-425): A multi-model selection framework for unknown and/or evolutive misclassification cost problems
In this paper, we tackle the problem of model selection when misclassification costs are unknown and/or may evolve. Unlike traditional approaches based on a scalar optimization, we propose a generic multi-model selection framework based on a multi-objective approach. The idea is to automatically train a pool of classifiers instead of one single classifier, each classifier in the pool optimizing a particular trade-off between the objectives. Within the context of two-class classification problems, we introduce the "ROC front concept" as an alternative to the ROC curve representation. This strategy is applied to the multi-model selection of SVM classifiers using an evolutionary multi-objective optimization algorithm. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets as well as on a real-world classification problem. (C) 2009 Elsevier Ltd. All rights reserved. 2010
* 45(<-429): NONCOST SENSITIVE SVM TRAINING USING MULTIPLE MODEL SELECTION
In this paper, we propose a multi-objective optimization framework for SVM hyperparameter tuning.
The key idea is to manage a population of classifiers optimizing both the False Positive and True Positive rates rather than a single classifier optimizing a scalar criterion. Hence, each classifier in the population optimizes a particular trade-off between the objectives. Within the context of two-class classification problems, our work introduces the "receiver operating characteristics (ROC) front concept", depicting a population of SVM classifiers as an alternative to the ROC curve representation. The proposed framework leads to a non-cost-sensitive SVM training relying on the pool of classifiers. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets. 2010
* 46(<-567): Two-group classification via a biobjective margin maximization model
In this paper we propose a biobjective model for two-group classification via margin maximization, in which the margins in both classes are simultaneously maximized. The set of Pareto-optimal solutions is described, yielding a set of parallel hyperplanes, one of which is just the solution of the classical SVM approach. In order to take into account different misclassification costs or a priori probabilities, the ROC curve can be used to select one out of such hyperplanes by expressing the adequate trade-off between sensitivity and specificity. Our result gives a theoretical motivation for using the ROC approach in case the misclassification costs in the two groups are not necessarily equal. (c) 2005 Elsevier B.V. All rights reserved. 2006
* 54(<-127): SVM classification for imbalanced data sets using a multiobjective optimization framework
Classification of imbalanced data sets, in which negative instances outnumber the positive instances, is a significant challenge. These data sets are commonly encountered in real-life problems. However, the performance of well-known classifiers is limited in such cases.
Various solution approaches have been proposed for the class imbalance problem using either data-level or algorithm-level modifications. Support Vector Machines (SVMs), which have a solid theoretical background, also encounter a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L1-norm SVM approach that is based on a three-objective optimization problem so as to incorporate into the formulation the error sums for the two classes independently. Motivated by the inherent multi-objective nature of SVMs, the solution approach utilizes a reduction into two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements, with varying degrees of increased computational effort. 2014
* 66(<-498): Accurate and resource-aware classification based on measurement data
In this paper, we face the problem of designing accurate decision-making modules in measurement systems that need to be implemented on resource-constrained platforms. We propose a methodology based on multiobjective optimization and genetic algorithms (GAs) for the analysis of support vector machine (SVM) solutions in the classification error-complexity space. Specific criteria for the choice of optimal SVM classifiers and experimental results on both real and synthetic data are also discussed. 2008
* 71(<-411): Integrating Clustering and Supervised Learning for Categorical Data Analysis
The problem of fuzzy clustering of categorical data, where no natural ordering among the elements of a categorical attribute domain can be found, is an important problem in exploratory data analysis. As a result, a few clustering algorithms focused on categorical data have been proposed.
In this paper, a modified differential evolution (DE)-based fuzzy c-medoids (FCMdd) clustering of categorical data is proposed. The algorithm combines both local and global information with adaptive weighting. The performance of the proposed method has been compared with those using a genetic algorithm, simulated annealing, and the classical DE technique, besides the FCMdd, fuzzy k-modes, and average-linkage hierarchical clustering algorithms, on four artificial and four real-life categorical data sets. Statistical tests have been carried out to establish the statistical significance of the proposed method. To improve the results further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the data points, selected from different clusters based on their proximity to the respective medoids, is used for training the SVM. The clustering assignments of the remaining points are thereafter determined using the trained classifier. The superiority of the integrated clustering and supervised learning approach has been demonstrated. 2010
* 77(<- 79): Pareto-Path Multitask Multiple Kernel Learning
A traditional and intuitively appealing Multitask Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with a (partially) shared kernel function, which allows information sharing among the tasks. We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a multiobjective optimization problem, which considers the concurrent optimization of all task objectives involved in the Multitask Learning (MTL) problem. Motivated by this last observation, and arguing that the former approach is heuristic, we propose a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives.
We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of the objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance when compared with other similar MTL approaches. 2015
* 286(<-576): Using a multi-objective genetic algorithm for SVM construction
Support Vector Machines are kernel machines useful for classification and regression problems. In this paper, they are used for non-linear regression of environmental data. From a structural point of view, Support Vector Machines are particular Artificial Neural Networks, and their training paradigm has some positive implications. In fact, the original training approach is useful to overcome the curse of dimensionality and overly strict assumptions on the statistics of the errors in the data. Support Vector Machines and Radial Basis Function Regularised Networks are presented within a common structural framework for non-linear regression in order to emphasise the training strategy for Support Vector Machines and to better explain the multi-objective approach in Support Vector Machines' construction. A Support Vector Machine's performance depends on the kernel parameter, input selection, and the optimal dimension of the epsilon-tube. These are used as decision variables for the evolutionary strategy based on a Genetic Algorithm, which takes the number of support vectors, for the capacity of the machine, and the fitness to a validation subset, for the model's accuracy in mapping the underlying physical phenomena, as objective functions. The strategy is tested on a case study dealing with groundwater modelling, based on time series (past measured rainfalls and levels) for level predictions at variable time horizons.
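Entry 77's observation, that a weighted (in particular, averaged) sum of task objectives yields exactly one point on the Pareto front, and that sweeping the weight traces a path along it, can be seen in one dimension. A sketch with two assumed toy quadratic objectives (not the paper's SVM objectives):

```python
# Two conflicting objectives with minimizers at w = 0 and w = 2 (assumed toys).
f1 = lambda w: (w - 0.0) ** 2
f2 = lambda w: (w - 2.0) ** 2

def weighted_min(lam):
    # Minimizer of lam*f1(w) + (1-lam)*f2(w), from setting the derivative to zero:
    # 2*lam*w + 2*(1-lam)*(w-2) = 0  =>  w = 2*(1-lam)
    return 2.0 * (1.0 - lam)

# Sweeping the convex-combination weight traces a path of Pareto-optimal points.
path = [weighted_min(lam) for lam in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(path)  # [2.0, 1.5, 1.0, 0.5, 0.0] -- endpoints are the single-task minimizers
```

The plain average is the lam = 0.5 point; the paper's conic combinations generalize this to reach other points on the front.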
2006
* 324(<-377): A multi-objective artificial immune algorithm for parameter optimization in support vector machine
The support vector machine (SVM) is a classification method based on the structural risk minimization principle. The penalty parameter C and the kernel parameter sigma of the SVM must be carefully selected to establish an efficient SVM model. These parameters are usually selected by trial and error or from experience. An artificial immune system (AIS) can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. In this paper, a multi-objective artificial immune algorithm is used to optimize the kernel and penalty parameters of the SVM. In the training stage of the SVM, multiple solutions are found using the multi-objective artificial immune algorithm, and these parameters are then evaluated in the test stage. The proposed algorithm is applied to fault diagnosis of induction motors and anomaly detection problems, and successful results are obtained. (c) 2009 Elsevier B.V. All rights reserved. 2011
* 326(<-687): USING GENETIC ALGORITHMS FOR AN ARTIFICIAL NEURAL-NETWORK MODEL INVERSION
Genetic algorithms (GAs) and artificial neural networks (ANNs) are techniques for optimization and learning, respectively, which have both been adopted from nature. Their main advantage over traditional techniques is their relatively better performance when applied to complex relations. GAs and ANNs are both self-learning systems, i.e., they do not require any background knowledge from the creator. In this paper, we describe the performance of a GA that finds hypothetical physical structures of poly(ethylene terephthalate) (PET) yarns corresponding to a certain combination of mechanical and shrinkage properties. This GA uses a validated ANN that has been trained for the complex relation between the structure and properties of PET.
This technique was tested by comparing the optimal points found by the GA with known experimental data under a variety of multi-criteria conditions. 1993
* 344(<-404): Multiple criteria optimization-based data mining methods and applications: a systematic survey
The Support Vector Machine, an optimization technique, is well known in the data mining community. In fact, many other optimization techniques have been effectively used in dealing with data separation and analysis. For the last 10 years, the author and his colleagues have proposed and extended a series of optimization-based classification models via Multiple Criteria Linear Programming (MCLP) and Multiple Criteria Quadratic Programming (MCQP). These methods are different from statistics, decision tree induction, and neural networks. The purpose of this paper is to review the basic concepts and frameworks of these methods and promote research interest in the data mining community. Following the evolution of multiple criteria programming, the paper starts with the bases of MCLP. It then further discusses penalized MCLP, MCQP, Multiple Criteria Fuzzy Linear Programming (MCFLP), Multi-Class Multiple Criteria Programming (MCMCP), and the kernel-based Multiple Criteria Linear Program, as well as MCLP-based regression. The paper also outlines several applications of Multiple Criteria optimization-based data mining methods, such as Credit Card Risk Analysis, Classification of HIV-1 Mediated Neuronal Dendritic and Synaptic Damage, Network Intrusion Detection, Firm Bankruptcy Prediction, and VIP E-Mail Behavior Analysis. 2010
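As a worked footnote to entry 30 above: the "linear equation system" that replaces the QP in LSSVM training can be written out directly. A NumPy sketch of one common form of that system (the toy data and the gamma and kernel-width values are assumptions for illustration):

```python
import numpy as np

# Toy data with +1/-1 labels (assumed for illustration).
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.2, 0.9]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
gamma = 10.0   # LSSVM regularization (assumed value)
sigma2 = 0.5   # RBF kernel width (assumed value)

# RBF kernel matrix.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / (2 * sigma2))

# LSSVM training = one linear solve:
# [ 0   1^T         ] [b]       [0]
# [ 1   K + I/gamma ] [alpha] = [y]
n = len(y)
A = np.zeros((n + 1, n + 1))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[1:, 1:] = K + np.eye(n) / gamma
rhs = np.concatenate(([0.0], y))
sol = np.linalg.solve(A, rhs)
b, alpha = sol[0], sol[1:]

def predict(x):
    k = np.exp(-((X - x) ** 2).sum(-1) / (2 * sigma2))
    return np.sign(alpha @ k + b)

print([predict(x) for x in X])
```

Note that every alpha is nonzero here, which is exactly the lack of sparseness the entry's GA approaches are designed to prune.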