plot_1feature_2class

Visualises usefulness of a feature for binary classification.
360 Downloads
Updated 24 Nov 2014

When you classify data into two classes based on a single feature, the first thing to review is the conditional probability that a point belongs to the first class given the value of the feature. However, as usual, overfitting lurks in the shadows, so the amount of data behind each conditional probability estimate should be reviewed as well. This function plots, on the same graph, the conditional probability and a histogram of the data distribution for each class separately.
For example, assume that you are trying to predict whether a patient will show paroxysmal atrial fibrillation (PAF) during a 24-hour ambulatory ECG recording (stored in the boolean vector 'holter') based on the value of peak ischemic ST depression during a cardiac stress test (stored in the vector 'ST').
Use the following commands to initialize the variables:
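>> N = 1e3;   % number of samples in the main cluster
>> k = 20;    % number of samples in each sparse tail
>> % Main cluster around 5 mm, plus a low tail in [0,2] and a high tail in [8,10]
>> ST = [5+randn(1,N) 2*rand(1,k) 8+2*rand(1,k)];
>> % Main cluster: true when ST exceeds a random threshold in [2,8];
>> % low tail: k-1 true, 1 false; high tail: k-1 false, 1 true
>> holter = [2+6*rand(1,N)<ST(1:N) true(1,k-1) false(1,k) true];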

Now you can visualise the data using the following commands:
>> plot_1feature_2class(ST, holter, 'ST, mm', {'PAF', 'no PAF'})
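
To make the plotted quantities concrete, the following is a minimal sketch of how the per-bin class counts and the conditional probability could be computed by hand for the same synthetic data. It is not the code of plot_1feature_2class itself: the bin count nBins, the equal-width binning and all variable names other than ST and holter are assumptions made for illustration.

```matlab
% Hypothetical sketch; NOT the implementation of plot_1feature_2class.
N = 1e3;
k = 20;
ST = [5+randn(1,N) 2*rand(1,k) 8+2*rand(1,k)];
holter = [2+6*rand(1,N)<ST(1:N) true(1,k-1) false(1,k) true];

nBins = 10;                                   % assumed number of bins
edges = linspace(min(ST), max(ST), nBins+1);  % bins span all feature values
binWidth = edges(2) - edges(1);

% Assign each sample to a bin; the largest value is clipped into the last bin.
binIdx = min(floor((ST - edges(1)) / binWidth) + 1, nBins);

% Per-bin counts for each class (the data behind the two histograms).
nTrue  = accumarray(binIdx(:), double(holter(:)),  [nBins 1]);
nFalse = accumarray(binIdx(:), double(~holter(:)), [nBins 1]);
nTotal = nTrue + nFalse;

% Estimated P(class == true | feature in bin); NaN where a bin is empty.
pTrue = nTrue ./ nTotal;

centers = (edges(1:end-1) + edges(2:end)) / 2;
disp([centers(:) nTotal pTrue]);              % bin center, sample count, probability
```

Bins with a small nTotal, such as the sparse tails of this example, yield probability estimates that rest on very few points, which is why the histograms are worth reading alongside the probability curve.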

For additional explanations type: help plot_1feature_2class

This submission is courtesy of Norav Medical (www.norav.com) - the leading company in the fields of PC-ECG, EKG Management systems and related non-invasive cardiac devices.

Cite As

Mark Matusevich (2024). plot_1feature_2class (https://www.mathworks.com/matlabcentral/fileexchange/45580-plot_1feature_2class), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2009b
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Histograms

Version Published Release Notes
1.1

Bins are now calculated from the range of all feature values instead of only those with vClass==true, and the probability plot uses the same bins as the histograms.

Also includes updates from dependents_plot (submission No. 45481).

1.0.0.0