How to define range limits to classify values with an optimal success percentage?

Maybe my question isn't so clear, but I will explain everything here:
I have 2 input datasets of values: GREEN & RED (total of 90 values from 0.3 to 0.6)
I want to classify those values as HEALTH, DOUBT, and SICK by defining range limits (e.g., x < 0.4; 0.4 < x < 0.5; x > 0.5) so that most HEALTH values fall in GREEN and most SICK values in RED (the remaining values are DOUBT).
My final goal is to automatically maximize the success percentage of the model, that is, to find the optimal values of these limits.
I was able to do it manually, inserting limits and evaluating the success percentage (it wasn't the optimal result), so I wonder if there is an automatic way to do it.
Do you have any suggestions?
Thank you in advance for your time and your help.

Answers (1)

Shubham on 28 Feb 2024
Hi Matteo,
To automate the process of finding the optimal threshold values that maximize the success percentage of classifying your GREEN and RED datasets into HEALTH, DOUBT, and SICK categories, you can use a computational approach.
  1. Define an objective function. It should take the lower and upper thresholds and the GREEN and RED datasets as inputs, and return the success percentage: the proportion of values correctly classified, counting GREEN values below the lower threshold as HEALTH and RED values above the upper threshold as SICK. Values between the thresholds are DOUBT.
  2. Choose an optimization approach that MATLAB supports, such as fminsearch, ga (genetic algorithm), or particleswarm for more complex optimizations. These solvers minimize, so pass them the negative of the success percentage. For a grid search, you can simply use nested loops to iterate over a range of threshold values.
  3. Set the feasible ranges for the lower and upper thresholds. These ranges are based on your data and the initial conditions you provided (e.g., 0.3 to 0.6). You must ensure that the lower threshold is less than the upper threshold.
  4. Run the optimization algorithm over the specified ranges. The algorithm will evaluate the success percentage for each pair of thresholds and search for the pair that maximizes this percentage.
  5. After the optimization, the algorithm will return the pair of threshold values that yield the highest success percentage. These values are your optimal thresholds for classifying the data into HEALTH, DOUBT, and SICK.
  6. If you have additional data, validate the chosen thresholds against this new dataset to ensure that the model's performance is consistent.
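
The steps above can be sketched as a plain grid search. This is only a minimal illustration under stated assumptions: the `green` and `red` vectors are hypothetical placeholder values (not your measurements), the 0.01 grid step is an arbitrary choice, and `successPct` is one possible reading of "success percentage" in which DOUBT values count as not successful.

```matlab
% Placeholder data (hypothetical values, replace with your own vectors).
green = [0.312 0.334 0.356 0.378 0.412 0.446];   % expected to be mostly HEALTH
red   = [0.423 0.467 0.501 0.534 0.556 0.589];   % expected to be mostly SICK

% Step 1: objective. Assumed definition: GREEN values below lo plus RED
% values above hi, as a percentage of all values.
successPct = @(lo, hi, g, r) 100 * (sum(g < lo) + sum(r > hi)) / (numel(g) + numel(r));

% Steps 2-3: candidate thresholds over the data range (step size is an assumption).
candidates = 0.30:0.01:0.60;

% Step 4: exhaustive search over all pairs with lo < hi.
bestPct = -Inf;
bestLo = NaN;
bestHi = NaN;
for lo = candidates
    for hi = candidates
        if lo >= hi
            continue   % enforce lower threshold < upper threshold
        end
        pct = successPct(lo, hi, green, red);
        if pct > bestPct   % strict comparison keeps the first best pair found
            bestPct = pct;
            bestLo = lo;
            bestHi = hi;
        end
    end
end

% Step 5: report the best pair found on the grid.
fprintf('lo = %.2f, hi = %.2f, success = %.1f%%\n', bestLo, bestHi, bestPct);
```

Because this scoring treats every DOUBT value as a miss, the best pair tends to leave a very narrow DOUBT band. If you want a wider band, one option is to change the score, for example by penalizing only GREEN values above the upper threshold and RED values below the lower one. With about 90 values, the full grid takes well under a second, so a solver is optional here.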

Release

R2021b
