Genetic Algorithm. About ga function in matlab and feature selection

I am using a genetic algorithm for feature selection. I used MATLAB's built-in ga function.
Where do I get the selected features from in that function? What value is stored in x, and what is it used for? Do I have to include other steps for feature selection? What should nvars be? Is it the number of features input to the genetic algorithm? Kindly clarify. Thanks in advance.

Accepted Answer

Walter Roberson on 13 Apr 2022
The function fun that you pass in is responsible for accepting trial vectors of model parameters and evaluating the "cost" associated with those particular parameters. The return value, x, is the vector of model parameters that resulted in the lowest "cost".
In the case of feature selection, the trial parameters could include a vector of integer decision variables restricted to 0 or 1, with 0 meaning that the corresponding feature is not selected and 1 meaning that it is selected. To select no more than N features, you could add a linear constraint that the sum of those decision variables is <= N.
Because the return value is the trial parameter vector that resulted in the lowest cost, if you use integer decision variables as I describe, that section of the output vector tells you which features were selected (1) or not (0).
The return value from ga() does not inherently tell you about selected features: you have to arrange your function so that the set of selected features can be calculated from the inputs, and then, once you get the best parameters out, use them to say which features were selected.
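The binary encoding described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, selectionCost is a hypothetical helper, and the least-squares cost is a stand-in for whatever model quality measure you actually use. It requires Global Optimization Toolbox and should be saved as a script file so the local function at the end is allowed.

```matlab
% Synthetic data for illustration only: 100 observations, 8 features.
rng(0);
X = randn(100, 8);
y = X(:, [2 5]) * [1.5; -2] + 0.1 * randn(100, 1);

nFeatures   = size(X, 2);
maxSelected = 3;                 % N in the text above (assumed value)

nvars  = nFeatures;              % one 0/1 decision variable per feature
A      = ones(1, nvars);         % linear constraint: sum(x) <= maxSelected
b      = maxSelected;
lb     = zeros(1, nvars);        % lower bound 0
ub     = ones(1, nvars);         % upper bound 1
intcon = 1:nvars;                % every variable is integer constrained

x = ga(@(x) selectionCost(x, X, y), nvars, A, b, [], [], lb, ub, [], intcon);

% Decode the result: positions holding a 1 are the selected features.
selected = find(round(x) == 1);

function c = selectionCost(x, X, y)
    % Hypothetical cost: mean squared residual of a least-squares fit
    % that uses only the columns switched on in x.
    mask = logical(round(x));
    if ~any(mask)
        c = mean(y .^ 2);        % no features selected: predict zero
        return
    end
    beta = X(:, mask) \ y;
    c = mean((y - X(:, mask) * beta) .^ 2);
end
```

Here ga itself never sees the features; it only sees the 0/1 vector, and the decoding step after the call is what turns x into a list of feature indices.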
Another approach instead of binary decision variables would be to use a vector of integer constrained variables, each between 1 and the number of features, effectively listing off which are selected.
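A rough sketch of this index-based encoding, again with synthetic data and a hypothetical cost helper (indexCost), might look like the following. The choice of k = 3 slots is an assumption for illustration. Note that nothing here prevents two slots from naming the same feature, so the sketch collapses duplicates with unique.

```matlab
% Synthetic data for illustration only: 100 observations, 8 features.
rng(0);
X = randn(100, 8);
y = X(:, [2 5]) * [1.5; -2] + 0.1 * randn(100, 1);

nFeatures = size(X, 2);
k = 3;                            % number of index slots (assumed value)

nvars  = k;                       % here nvars is not the number of features
lb     = ones(1, k);              % each slot holds a feature number >= 1
ub     = nFeatures * ones(1, k);  % ... and <= nFeatures
intcon = 1:k;                     % every slot is integer constrained

x = ga(@(x) indexCost(x, X, y), nvars, [], [], [], [], lb, ub, [], intcon);

% Decode the result: the slots list the selected features directly.
selected = unique(round(x));      % duplicates collapse to one feature

function c = indexCost(x, X, y)
    % Hypothetical cost: mean squared residual of a least-squares fit
    % on the columns named by the index vector x.
    cols = unique(round(x));
    beta = X(:, cols) \ y;
    c = mean((y - X(:, cols) * beta) .^ 2);
end
```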
nvars is the total number of model parameters that are to be varied. The output will be of the length indicated by nvars. This is not necessarily the same as the number of features, since you might have extra variables not being used as decision variables, or you might have chosen to encode by feature number instead of by binary decision variables.
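To illustrate nvars being larger than the number of features, here is a sketch in which the vector carries one extra continuous variable next to the 0/1 mask. The extra variable (a ridge penalty), its bounds, the fixed hold-out split, and the helper ridgeCost are all assumptions made up for this example, not part of the original question.

```matlab
% Synthetic data for illustration only: 100 observations, 8 features.
rng(0);
X = randn(100, 8);
y = X(:, [2 5]) * [1.5; -2] + 0.1 * randn(100, 1);
nFeatures = size(X, 2);

nvars  = nFeatures + 1;                 % 8 mask entries + 1 ridge penalty
lb     = [zeros(1, nFeatures), 0];
ub     = [ones(1, nFeatures), 10];
intcon = 1:nFeatures;                   % only the mask is integer constrained

x = ga(@(x) ridgeCost(x, X, y), nvars, [], [], [], [], lb, ub, [], intcon);

% Split the output vector back into its two parts.
mask     = logical(round(x(1:nFeatures)));
selected = find(mask);                  % selected feature indices
lambda   = x(end);                      % extra, non-decision variable

function c = ridgeCost(x, X, y)
    % Hypothetical cost: validation error of a ridge fit on the
    % selected columns, using a fixed 70/30 hold-out split.
    n      = size(X, 2);
    mask   = logical(round(x(1:n)));
    lambda = x(end);
    train  = 1:70;
    valid  = 71:100;
    if ~any(mask)
        c = mean(y(valid) .^ 2);        % no features selected: predict zero
        return
    end
    Xt   = X(train, mask);
    beta = (Xt' * Xt + lambda * eye(nnz(mask))) \ (Xt' * y(train));
    c    = mean((y(valid) - X(valid, mask) * beta) .^ 2);
end
```

The output x has length nvars = 9 here, but only its first 8 entries say anything about which features were selected.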
  13 Comments
Little Flower on 6 Jul 2022
Thank you. I have another question: which test is suitable for finding the statistical significance between features?
Little Flower on 6 Jul 2022
I have a dataset stored as an (m,n) matrix. Here m is the number of observations and n is the number of features. These m samples fall into c classes, for example c=4. My question is: can I run a statistical significance test with the whole matrix as input, or do I have to separate it by class? Which method is preferred in my case: Student's t-test, ANOVA, or some other test?

Release

R2020b
