-*- mode: org -*-
Focus strictly on SVM learning
* 3(<-596): MOP/GP models for machine learning
Techniques for machine learning have been extensively studied in recent years as effective tools in data mining. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) is gaining much popularity recently. In pattern classification problems with two class sets, its idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. This task is performed by solving a quadratic programming problem in the traditional formulation, and can be reduced to solving a linear program in another formulation. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper presents an overview of how effectively MOP/GP techniques can be applied to machine learning such as SVM, and discusses their problems. (c) 2004 Elsevier B.V. All rights reserved. 2005
* 4(<-614): Study on Support Vector Machines Using Mathematical Programming
Machine learning has been extensively studied in recent years as an effective tool in pattern classification problems. Although there have been several approaches to machine learning, we focus on the mathematical programming (in particular, multi-objective and goal programming; MOP/GP) approaches in this paper. Among them, the Support Vector Machine (SVM) is gaining much popularity recently.
In pattern classification problems with two class sets, the idea is to find a maximal margin separating hyperplane which gives the greatest separation between the classes in a high-dimensional feature space. However, the idea of maximal margin separation is not quite new: in the 1960s the multi-surface method (MSM) was suggested by Mangasarian. In the 1980s, linear classifiers using goal programming were developed extensively. This paper proposes a new family of SVMs using MOP/GP techniques, and discusses its effectiveness through several numerical experiments. 2005
* 6(<-465): A Multiobjective Genetic SVM Approach for Classification Problems With Limited Training Samples
In this paper, a novel method for semisupervised classification with limited training samples is presented. Its aim is to exploit unlabeled data, available at zero cost in the image under analysis, for improving the accuracy of a classification process based on support vector machines (SVMs). It is based on the idea of augmenting the original set of training samples with a set of unlabeled samples after estimating their labels. The label estimation process is performed within a multiobjective genetic optimization framework where each chromosome of the evolving population encodes the label estimates as well as the SVM classifier parameters for tackling the model selection issue. Such a process is guided by the joint minimization of two different criteria which express the generalization capability of the SVM classifier. The two explored criteria are an empirical risk measure and an indicator of the classification model sparseness, respectively. The experimental results obtained on two multisource remote sensing data sets confirm the promising capabilities of the proposed approach, which allows the following: 1) taking a clear advantage in terms of classification accuracy from unlabeled samples used for inflating the original training set and 2) solving automatically the tricky model selection issue.
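The maximal-margin idea in entries 3 and 4 reduces, in its soft-margin form, to minimizing a regularized hinge loss. A minimal sub-gradient sketch in plain Python (the toy data, step size, regularization strength, and epoch count are illustrative assumptions, not taken from either paper):

```python
# Toy linearly separable 2-D data with +1/-1 labels (assumed for illustration).
X = [(2.0, 2.0), (3.0, 2.5), (2.5, 3.0), (-2.0, -2.0), (-3.0, -2.5), (-2.5, -3.0)]
y = [1, 1, 1, -1, -1, -1]

w = [0.0, 0.0]
b = 0.0
lam = 0.01   # regularization strength (assumed value)
eta = 0.1    # step size (assumed value)

for epoch in range(200):
    for (x1, x2), yi in zip(X, y):
        margin = yi * (w[0] * x1 + w[1] * x2 + b)
        # Sub-gradient step on lam/2 * ||w||^2 + max(0, 1 - margin)
        if margin < 1:
            w[0] += eta * (yi * x1 - lam * w[0])
            w[1] += eta * (yi * x2 - lam * w[1])
            b += eta * yi
        else:
            w[0] -= eta * lam * w[0]
            w[1] -= eta * lam * w[1]

preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for x1, x2 in X]
print(preds)  # should match y on this separable toy set
```

The QP formulation the papers mention solves the same trade-off exactly; the sub-gradient loop is only the cheapest way to see the objective at work.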
2009
* 8(<-514): Genetic SVM approach to semisupervised multitemporal classification
The updating of classification maps, as new image acquisitions are obtained, raises the problem of updating the ground-truth information (training samples). In this context, semisupervised multitemporal classification represents an interesting though still not well consolidated approach to tackle this issue. In this letter, we propose a novel methodological solution based on this approach. Its underlying idea is to update the ground-truth information through an automatic estimation process, which exploits archived ground-truth information as well as basic indications from the user about allowed/forbidden class transitions from one acquisition date to another. This updating problem is formulated by means of the support vector machine classification approach and a constrained multiobjective optimization genetic algorithm. Experimental results on a multitemporal data set consisting of two multisensor (Landsat-5 Thematic Mapper and European Remote Sensing satellite synthetic aperture radar) images are reported and discussed. 2008
* 13(<- 90): A hybrid meta-learning architecture for multi-objective optimization of SVM parameters
Support Vector Machines (SVMs) have received considerable attention due to their theoretical foundations and good empirical performance when compared to other learning algorithms in different applications. However, SVM performance strongly depends on the adequate calibration of its parameters. In this work we propose a hybrid multi-objective architecture which combines meta-learning (ML) with multi-objective particle swarm optimization algorithms for the SVM parameter selection problem. Given an input problem, the proposed architecture uses an ML technique to suggest an initial Pareto front of SVM configurations based on previous similar learning problems; the suggested Pareto front is then refined by a multi-objective optimization algorithm.
In this combination, solutions provided by ML are likely to be located in good regions of the search space. Hence, using a reduced number of successful candidates, the search process converges faster and is less expensive. In the performed experiments, the proposed solution was compared to traditional multi-objective algorithms with random initialization, obtaining Pareto fronts with higher quality on a set of 100 classification problems. (C) 2014 Elsevier B.V. All rights reserved. 2014
* 19(<-585): Multiobjective analysis of chaotic dynamic systems with sparse learning machines
Sparse learning machines provide a viable framework for modeling chaotic time-series systems. A powerful state-space reconstruction methodology using both support vector machines (SVM) and relevance vector machines (RVM) within a multiobjective optimization framework is presented in this paper. The utility and practicality of the proposed approaches have been demonstrated on the time series of the Great Salt Lake (GSL) biweekly volumes from 1848 to 2004. A comparison of the two methods is made based on their predictive power and robustness. The reconstruction of the dynamics of the Great Salt Lake volume time series is attained using the most relevant feature subset of the training data. In this paper, efforts are also made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure, model parameters, and bootstrapping samples. The resulting model will normally have a structure, including parameterization, that suits the information content of the available data, and can be used to develop time series forecasts for multiple lead times ranging from two weeks to several months. (c) 2005 Elsevier Ltd. All rights reserved.
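Entry 13 maintains a Pareto front of SVM configurations rather than a single best one. The core operation is non-dominated filtering, sketched here in plain Python over hypothetical (C, gamma) configurations with two assumed objectives, validation error and model complexity (number of support vectors):

```python
# Hypothetical SVM configurations; "error" and "n_sv" are assumed objective values.
configs = [
    {"C": 1.0,   "gamma": 0.1,  "error": 0.10, "n_sv": 40},
    {"C": 10.0,  "gamma": 0.1,  "error": 0.08, "n_sv": 55},
    {"C": 10.0,  "gamma": 1.0,  "error": 0.12, "n_sv": 45},  # dominated by the first
    {"C": 100.0, "gamma": 0.5,  "error": 0.05, "n_sv": 90},
    {"C": 0.1,   "gamma": 0.01, "error": 0.20, "n_sv": 25},
]

def dominates(a, b):
    # a dominates b if a is no worse on both objectives and strictly better on one.
    no_worse = a["error"] <= b["error"] and a["n_sv"] <= b["n_sv"]
    strictly = a["error"] < b["error"] or a["n_sv"] < b["n_sv"]
    return no_worse and strictly

front = [c for c in configs
         if not any(dominates(other, c) for other in configs)]
print([(c["error"], c["n_sv"]) for c in front])
```

In the paper the initial front comes from meta-learning over similar past problems and is then refined by multi-objective PSO; the filter above is only the dominance test those steps share.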
2006
* 20(<-157): Leave-one-out cross-validation-based model selection for multi-input multi-output support vector machine
As an effective approach for multi-input multi-output regression estimation problems, a multi-dimensional support vector regression (SVR), named M-SVR, is generally capable of obtaining better predictions than applying a conventional support vector machine (SVM) independently to each output dimension. However, although there are many generalization error bounds for conventional SVMs, none of them can be directly applied to M-SVR. In this paper, a new leave-one-out (LOO) error estimate for M-SVR is first derived through a virtual LOO cross-validation procedure. This LOO error estimate can be computed directly once a training process has ended, with less computational complexity than the traditional LOO method. Based on this LOO estimate, a new model selection method for M-SVR based on a multi-objective optimization strategy is further proposed in this paper. Experiments on a toy noisy function regression and a practical engineering data set, that is, dynamic load identification on a cylinder vibration system, are both conducted, demonstrating comparable results of the proposed method in terms of generalization performance and computational cost. 2014
* 23(<-609): Multi-objective model selection for support vector machines
In this article, model selection for support vector machines is viewed as a multi-objective optimization problem, where model complexity and training accuracy define two conflicting objectives. Different optimization criteria are evaluated: split modified radius margin bounds, which allow for comparing existing model selection criteria, and the training error in conjunction with the number of support vectors for designing sparse solutions. 2005
* 26(<-150): A novel feature selection method for twin support vector machine
Both the support vector machine (SVM) and the twin support vector machine (TWSVM) are powerful classification tools.
However, in contrast to the many SVM-based feature selection methods, TWSVM has had no corresponding method up to now, due to its different mechanism. In this paper, we propose a feature selection method based on TWSVM, called FTSVM. It is interesting because of the advantages of TWSVM in many cases. Our FTSVM is quite different from the SVM-based feature selection methods. In fact, linear SVM constructs a single separating hyperplane, which corresponds to a single weight for each feature, whereas linear TWSVM constructs two fitting hyperplanes, which correspond to two weights for each feature. In our linear FTSVM, in order to link these two fitting hyperplanes, a feature selection matrix is introduced. Thus, feature selection becomes the problem of finding an optimal matrix, leading to a multi-objective mixed-integer programming problem solved by a greedy algorithm. In addition, the linear FTSVM has been extended to the nonlinear case. Furthermore, a feature ranking strategy based on FTSVM is also suggested. The experimental results on several publicly available benchmark datasets indicate that our FTSVM not only gives nice feature selection in both the linear and nonlinear cases but also improves the performance of TWSVM efficiently. (C) 2014 Elsevier B.V. All rights reserved. 2014
* 30(<- 8): Novel approaches using evolutionary computation for sparse least square support vector machines
This paper introduces two new approaches to building sparse least square support vector machines (LSSVM) based on genetic algorithms (GAs) for classification tasks. LSSVM classifiers are an alternative to SVM ones because the training process of LSSVM classifiers only requires solving a linear equation system instead of a quadratic programming optimization problem. However, the absence of sparseness in the Lagrange multiplier vector (i.e. the solution) is a significant problem for the effective use of these classifiers.
In order to overcome this lack of sparseness, we propose both single- and multi-objective GA approaches to leave a few support vectors out of the solution without affecting the classifier's accuracy, and even improving it. The main idea is to leave out outliers, non-relevant patterns, or those which may be corrupted with noise and thus prevent classifiers from achieving higher accuracies along with a reduced set of support vectors. Differently from previous works, genetic algorithms are used in this work to obtain sparseness, not to find the optimal values of the LSSVM hyper-parameters. (C) 2015 Elsevier B.V. All rights reserved. 2015
* 31(<-586): Additive preference model with piecewise linear components resulting from Dominance-based Rough Set Approximations
Dominance-based Rough Set Approach (DRSA) has been proposed for multi-criteria classification problems in order to handle inconsistencies in the input information with respect to the dominance principle. The end result of DRSA is a decision rule model of Decision Maker preferences. In this paper, we consider an additive function model resulting from dominance-based rough approximations. The presented approach is similar to the UTA and UTADIS methods. However, we define the goal function of the optimization problem in a similar way as is done in Support Vector Machines (SVM). The problem may also be defined as one of searching for linear value functions in a transformed feature space obtained by exhaustive binarization of criteria. 2006
* 37(<- 64): Surrogate-assisted multi-objective model selection for support vector machines
Classification is one of the most well-known tasks in supervised learning. A vast number of algorithms for pattern classification have been proposed so far. Among these, support vector machines (SVMs) are one of the most popular approaches, due to the high performance reached by these methods in a wide number of pattern recognition applications.
Nevertheless, the effectiveness of SVMs highly depends on their hyper-parameters. Besides the fine-tuning of the hyper-parameters, the way in which the features are scaled, as well as the presence of non-relevant features, can affect generalization performance. This paper introduces an approach for addressing model selection for support vector machines used in classification tasks. In our formulation, a model can be composed of feature selection and pre-processing methods besides the SVM classifier. We formulate the model selection problem as a multi-objective one, aiming to minimize simultaneously two components that are closely related to the error of a model: the bias and variance components, which are estimated in an experimental fashion. A surrogate-assisted evolutionary multi-objective optimization approach is adopted to explore the hyper-parameter space. We adopted this approach because estimating the bias and variance can be computationally expensive. Therefore, by using surrogate-assisted optimization, we expect to reduce the number of solutions evaluated by the fitness functions, so that the computational cost is also reduced. Experimental results conducted on benchmark datasets widely used in the literature indicate that highly competitive models, with fewer fitness function evaluations, are obtained by our proposal when compared to state-of-the-art model selection methods. (C) 2014 Elsevier B.V. All rights reserved. 2015
* 41(<-580): Multi-objective parameters selection for SVM classification using NSGA-II
Selecting proper parameters is an important issue in extending the classification ability of the Support Vector Machine (SVM), which makes SVM practically useful. The Genetic Algorithm (GA) has been widely applied to the problem of parameter selection for SVM classification due to its ability to discover good solutions quickly for complex searching and optimization problems.
However, traditional GA approaches in this field rely on a single generalization error bound as the fitness function for selecting parameters. Since several generalization error bounds have been developed, picking and using a single criterion as the fitness function seems intractable and insufficient. Motivated by multi-objective optimization problems, this paper introduces an efficient method of parameter selection for SVM classification based on the multi-objective evolutionary algorithm NSGA-II. We also introduce an adaptive mutation rate for NSGA-II. Experimental results show that our method is better than single-objective approaches, especially in the case of tiny training sets with large testing sets. 2006
* 44(<-425): A multi-model selection framework for unknown and/or evolutive misclassification cost problems
In this paper, we tackle the problem of model selection when misclassification costs are unknown and/or may evolve. Unlike traditional approaches based on a scalar optimization, we propose a generic multi-model selection framework based on a multi-objective approach. The idea is to automatically train a pool of classifiers instead of one single classifier, each classifier in the pool optimizing a particular trade-off between the objectives. Within the context of two-class classification problems, we introduce the "ROC front concept" as an alternative to the ROC curve representation. This strategy is applied to the multi-model selection of SVM classifiers using an evolutionary multi-objective optimization algorithm. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets as well as on a real-world classification problem. (C) 2009 Elsevier Ltd. All rights reserved. 2010
* 45(<-429): NONCOST SENSITIVE SVM TRAINING USING MULTIPLE MODEL SELECTION
In this paper, we propose a multi-objective optimization framework for SVM hyperparameter tuning.
The key idea is to manage a population of classifiers optimizing both the False Positive and True Positive rates rather than a single classifier optimizing a scalar criterion. Hence, each classifier in the population optimizes a particular trade-off between the objectives. Within the context of two-class classification problems, our work introduces the "receiver operating characteristics (ROC) front concept", depicting a population of SVM classifiers as an alternative to the ROC curve representation. The proposed framework leads to a non-cost-sensitive SVM training relying on the pool of classifiers. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets. 2010
* 46(<-567): Two-group classification via a biobjective margin maximization model
In this paper we propose a biobjective model for two-group classification via margin maximization, in which the margins in both classes are simultaneously maximized. The set of Pareto-optimal solutions is described, yielding a set of parallel hyperplanes, one of which is just the solution of the classical SVM approach. In order to take into account different misclassification costs or a priori probabilities, the ROC curve can be used to select one out of such hyperplanes by expressing the adequate trade-off between sensitivity and specificity. Our result gives a theoretical motivation for using the ROC approach in case the misclassification costs in the two groups are not necessarily equal. (c) 2005 Elsevier B.V. All rights reserved. 2006
* 54(<-127): SVM classification for imbalanced data sets using a multiobjective optimization framework
Classification of imbalanced data sets, in which negative instances outnumber the positive instances, is a significant challenge. These data sets are commonly encountered in real-life problems. However, the performance of well-known classifiers is limited in such cases.
Various solution approaches have been proposed for the class imbalance problem using either data-level or algorithm-level modifications. Support Vector Machines (SVMs), which have a solid theoretical background, also encounter a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L1-norm SVM approach that is based on a three-objective optimization problem so as to incorporate into the formulation the error sums for the two classes independently. Motivated by the inherent multi-objective nature of SVMs, the solution approach utilizes a reduction into two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements, with varying degrees of increased computational effort. 2014
* 66(<-498): Accurate and resource-aware classification based on measurement data
In this paper, we face the problem of designing accurate decision-making modules in measurement systems that need to be implemented on resource-constrained platforms. We propose a methodology based on multiobjective optimization and genetic algorithms (GAs) for the analysis of support vector machine (SVM) solutions in the classification error-complexity space. Specific criteria for the choice of optimal SVM classifiers and experimental results on both real and synthetic data are also discussed. 2008
* 71(<-411): Integrating Clustering and Supervised Learning for Categorical Data Analysis
The problem of fuzzy clustering of categorical data, where no natural ordering among the elements of a categorical attribute domain can be found, is an important problem in exploratory data analysis. As a result, a few clustering algorithms focused on categorical data have been proposed.
In this paper, a modified differential evolution (DE)-based fuzzy c-medoids (FCMdd) clustering of categorical data is proposed. The algorithm combines both local and global information with adaptive weighting. The performance of the proposed method has been compared with those using a genetic algorithm, simulated annealing, and the classical DE technique, besides the FCMdd, fuzzy k-modes, and average-linkage hierarchical clustering algorithms, on four artificial and four real-life categorical data sets. Statistical tests have been carried out to establish the statistical significance of the proposed method. To improve the results further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the data points, selected from different clusters based on their proximity to the respective medoids, is used for training the SVM. The clustering assignments of the remaining points are thereafter determined using the trained classifier. The superiority of the integrated clustering and supervised learning approach has been demonstrated. 2010
* 77(<- 79): Pareto-Path Multitask Multiple Kernel Learning
A traditional and intuitively appealing Multitask Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with a (partially) shared kernel function, which allows information sharing among the tasks. We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a multiobjective optimization problem, which considers the concurrent optimization of all task objectives involved in the Multitask Learning (MTL) problem. Motivated by this last observation, and arguing that the former approach is heuristic, we propose a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives.
We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of the objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving better classification performance when compared with other similar MTL approaches. 2015
* 286(<-576): Using a multi-objective genetic algorithm for SVM construction
Support Vector Machines are kernel machines useful for classification and regression problems. In this paper, they are used for non-linear regression of environmental data. From a structural point of view, Support Vector Machines are particular Artificial Neural Networks, and their training paradigm has some positive implications. In fact, the original training approach is useful to overcome the curse of dimensionality and overly strict assumptions on the statistics of the errors in the data. Support Vector Machines and Radial Basis Function Regularised Networks are presented within a common structural framework for non-linear regression in order to emphasise the training strategy for Support Vector Machines and to better explain the multi-objective approach in Support Vector Machines' construction. A Support Vector Machine's performance depends on the kernel parameter, input selection, and the optimal dimension of the epsilon-tube. These are used as decision variables for the evolutionary strategy based on a Genetic Algorithm, which takes the number of support vectors, for the capacity of the machine, and the fitness to a validation subset, for the model's accuracy in mapping the underlying physical phenomena, as objective functions. The strategy is tested on a case study dealing with groundwater modelling, based on time series (past measured rainfalls and levels) for level predictions at variable time horizons.
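Entry 77's observation, that a weighted (in particular, averaged) sum of task objectives yields exactly one point on the Pareto front, and that sweeping the weight traces a path along it, can be seen in one dimension. A sketch with two assumed toy quadratic objectives (not the paper's SVM objectives):

```python
# Two conflicting objectives with minimizers at w = 0 and w = 2 (assumed toys).
f1 = lambda w: (w - 0.0) ** 2
f2 = lambda w: (w - 2.0) ** 2

def weighted_min(lam):
    # Minimizer of lam*f1(w) + (1-lam)*f2(w), from setting the derivative to zero:
    # 2*lam*w + 2*(1-lam)*(w-2) = 0  =>  w = 2*(1-lam)
    return 2.0 * (1.0 - lam)

# Sweeping the convex-combination weight traces a path of Pareto-optimal points.
path = [weighted_min(lam) for lam in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(path)  # [2.0, 1.5, 1.0, 0.5, 0.0] -- endpoints are the single-task minimizers
```

The plain average is the lam = 0.5 point; the paper's conic combinations generalize this to reach other points on the front.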
2006
* 324(<-377): A multi-objective artificial immune algorithm for parameter optimization in support vector machine
The support vector machine (SVM) is a classification method based on the structural risk minimization principle. The penalty parameter C and the kernel parameter sigma of the SVM must be carefully selected to establish an efficient SVM model. These parameters are usually selected by trial and error or from experience. An artificial immune system (AIS) can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. In this paper, a multi-objective artificial immune algorithm is used to optimize the kernel and penalty parameters of the SVM. In the training stage of the SVM, multiple solutions are found using the multi-objective artificial immune algorithm, and these parameters are then evaluated in the test stage. The proposed algorithm is applied to fault diagnosis of induction motors and anomaly detection problems, and successful results are obtained. (c) 2009 Elsevier B.V. All rights reserved. 2011
* 326(<-687): USING GENETIC ALGORITHMS FOR AN ARTIFICIAL NEURAL-NETWORK MODEL INVERSION
Genetic algorithms (GAs) and artificial neural networks (ANNs) are techniques for optimization and learning, respectively, which have both been adopted from nature. Their main advantage over traditional techniques is their relatively better performance when applied to complex relations. GAs and ANNs are both self-learning systems, i.e., they do not require any background knowledge from the creator. In this paper, we describe the performance of a GA that finds hypothetical physical structures of poly(ethylene terephthalate) (PET) yarns corresponding to a certain combination of mechanical and shrinkage properties. This GA uses a validated ANN that has been trained for the complex relation between the structure and properties of PET.
This technique was tested by comparing the optimal points found by the GA with known experimental data under a variety of multi-criteria conditions. 1993
* 344(<-404): Multiple criteria optimization-based data mining methods and applications: a systematic survey
The Support Vector Machine, an optimization technique, is well known in the data mining community. In fact, many other optimization techniques have been effectively used in dealing with data separation and analysis. For the last 10 years, the author and his colleagues have proposed and extended a series of optimization-based classification models via Multiple Criteria Linear Programming (MCLP) and Multiple Criteria Quadratic Programming (MCQP). These methods are different from statistics, decision tree induction, and neural networks. The purpose of this paper is to review the basic concepts and frameworks of these methods and promote research interest in the data mining community. Following the evolution of multiple criteria programming, the paper starts with the bases of MCLP. It then further discusses penalized MCLP, MCQP, Multiple Criteria Fuzzy Linear Programming (MCFLP), Multi-Class Multiple Criteria Programming (MCMCP), and the kernel-based Multiple Criteria Linear Program, as well as MCLP-based regression. The paper also outlines several applications of Multiple Criteria optimization-based data mining methods, such as Credit Card Risk Analysis, Classification of HIV-1 Mediated Neuronal Dendritic and Synaptic Damage, Network Intrusion Detection, Firm Bankruptcy Prediction, and VIP E-Mail Behavior Analysis. 2010
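As a worked footnote to entry 30 above: the "linear equation system" that replaces the QP in LSSVM training can be written out directly. A NumPy sketch of one common form of that system (the toy data and the gamma and kernel-width values are assumptions for illustration):

```python
import numpy as np

# Toy data with +1/-1 labels (assumed for illustration).
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.2, 0.9]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
gamma = 10.0   # LSSVM regularization (assumed value)
sigma2 = 0.5   # RBF kernel width (assumed value)

# RBF kernel matrix.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / (2 * sigma2))

# LSSVM training = one linear solve:
# [ 0   1^T         ] [b]       [0]
# [ 1   K + I/gamma ] [alpha] = [y]
n = len(y)
A = np.zeros((n + 1, n + 1))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[1:, 1:] = K + np.eye(n) / gamma
rhs = np.concatenate(([0.0], y))
sol = np.linalg.solve(A, rhs)
b, alpha = sol[0], sol[1:]

def predict(x):
    k = np.exp(-((X - x) ** 2).sum(-1) / (2 * sigma2))
    return np.sign(alpha @ k + b)

print([predict(x) for x in X])
```

Note that every alpha is nonzero here, which is exactly the lack of sparseness the entry's GA approaches are designed to prune.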