Hi @NCA,
To address your query: “I am using One Class Support Vector Machines for anomaly detection. Here is the anomaly scores histogram (attached) for the model trained with 274 samples and tested with 31 samples. How do I determine the true/false prediction rates from the anomaly scores histogram.”
Please see my response to your comments below.
First, I generated synthetic data. In the code below, rng(1) sets the random number generator seed to 1 so that the results are reproducible. trainData holds 274 two-dimensional training samples drawn from a standard normal distribution (mean = 0, variance = 1). testData holds 31 normal samples followed by 10 anomalies, which are standard normal samples shifted by 5 units in both dimensions.
rng(1); % For reproducibility
trainData = randn(274, 2); % 274 samples for training
testData = [randn(31, 2); randn(10, 2) + 5]; % 31 normal samples and 10 anomalies
Next, I created a label vector for the training data. Every entry is set to 1, meaning that all training samples are treated as normal.
trainLabels = ones(size(trainData, 1), 1); 
Next, the fitcsvm function trains a One-Class SVM model on the training data and labels. 'KernelFunction', 'gaussian' selects a Gaussian kernel for the SVM; 'Standardize', true standardizes each predictor to zero mean and unit variance; and 'ClassNames', [1; -1] defines the class labels for the model (1 for normal, -1 for anomaly).
ocsvmModel = fitcsvm(trainData, trainLabels, 'KernelFunction', 'gaussian', ...
    'Standardize', true, 'ClassNames', [1; -1]);
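The trained model is then used to predict labels and scores for the test data. The anomaly score of each sample is taken from the score output. These are SVM scores, not probabilities, so only their sign and relative size are meaningful. Note that the fitcsvm documentation describes one-class scores as a single column in which negative values indicate outliers; if score comes back with only one column in your release, use -score as the anomaly score in place of score(:, 2).
[predictedLabels, score] = predict(ocsvmModel, testData);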
A figure with two subplots then shows the score distributions. The first subplot is a histogram of the anomaly scores for the test data, and the second is a histogram of the anomaly scores for the training data.
figure;
% Subplot for test data
subplot(2, 1, 1);
histogram(score(:, 2), 30, 'FaceColor', 'b', 'FaceAlpha', 0.5);
title('Anomaly Scores Histogram - Test Data');
xlabel('Anomaly Score');
ylabel('Frequency');
% Subplot for training data
subplot(2, 1, 2);
[~, trainScores] = predict(ocsvmModel, trainData); % Scores are the second output; the first is the predicted labels
trainAnomalyScores = trainScores(:, 2); % Same score column as used for the test data
histogram(trainAnomalyScores, 30, 'FaceColor', 'r', 'FaceAlpha', 0.5);
title('Anomaly Scores Histogram - Training Data');
xlabel('Anomaly Score');
ylabel('Frequency');
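Because the test set here has known labels, the histogram can also be split by true class, which shows directly how a threshold divides the two groups. The following is only a sketch: normalScores and anomalyScores are hypothetical stand-ins for the model's anomaly scores (higher meaning more anomalous), not output from the SVM above.

```matlab
% Hypothetical anomaly scores, for illustration only (higher = more anomalous)
rng(1);
normalScores  = randn(31, 1) - 1;   % stand-in scores for the 31 normal test samples
anomalyScores = randn(10, 1) + 2;   % stand-in scores for the 10 anomalous test samples
threshold = 0;

figure; hold on;
histogram(normalScores,  'BinWidth', 0.25, 'FaceColor', 'b', 'FaceAlpha', 0.5);
histogram(anomalyScores, 'BinWidth', 0.25, 'FaceColor', 'r', 'FaceAlpha', 0.5);
xline(threshold, 'k--');            % xline requires R2018b or later
hold off;
legend('Normal', 'Anomaly', 'Threshold');
xlabel('Anomaly Score');
ylabel('Frequency');

% Fraction of each histogram that lies to the right of the threshold
falsePositiveRate = mean(normalScores  > threshold);  % blue mass right of the line
truePositiveRate  = mean(anomalyScores > threshold);  % red mass right of the line
```

Read this way, the false positive rate is the share of the normal histogram to the right of the threshold, and the true positive rate is the share of the anomaly histogram to the right of it.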
Next, a threshold of 0 is applied to the anomaly scores: a sample whose score exceeds the threshold is flagged as an anomaly. The trueLabels vector holds the actual labels of the test data, in the same order as the rows of testData.
threshold = 0; % Set threshold for anomaly detection
predictions = score(:, 2) > threshold; % True if score indicates anomaly
% True labels: 1 for normal, -1 for anomaly
trueLabels = [ones(31, 1); -ones(10, 1)];
Then, I computed the true positive, false positive, true negative, and false negative counts from the predictions and the true labels. Here a "positive" is a sample flagged as an anomaly.
TP = sum(predictions(trueLabels == -1)); % True Positives
FP = sum(predictions(trueLabels == 1));  % False Positives
TN = sum(~predictions(trueLabels == 1)); % True Negatives
FN = sum(~predictions(trueLabels == -1));% False Negatives
The true positive rate (sensitivity) and false positive rate are calculated to evaluate the model's performance.
truePositiveRate = TP / (TP + FN);
falsePositiveRate = FP / (FP + TN);
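To make the counting concrete, here is a small worked example with seven hand-picked samples. The labels and predictions are made up for illustration; the formulas are the same as above.

```matlab
% Made-up example: 4 normal samples (1) followed by 3 anomalies (-1)
trueLabels  = [1; 1; 1; 1; -1; -1; -1];
predictions = logical([0; 0; 1; 0; 1; 1; 0]);  % true = flagged as anomaly

TP = sum(predictions(trueLabels == -1));   % anomalies flagged:       2
FN = sum(~predictions(trueLabels == -1));  % anomalies missed:        1
FP = sum(predictions(trueLabels == 1));    % normal samples flagged:  1
TN = sum(~predictions(trueLabels == 1));   % normal samples passed:   3

truePositiveRate  = TP / (TP + FN);        % 2/3
falsePositiveRate = FP / (FP + TN);        % 1/4

% Rows: actual anomaly, actual normal. Columns: flagged, not flagged.
confusionCounts = [TP FN; FP TN];
disp(confusionCounts);
```

The four counts always sum to the number of test samples, which is a quick sanity check on the indexing.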
Finally, the true positive and false positive rates are printed to the console, providing insight into the model's effectiveness in detecting anomalies.
fprintf('True Positive Rate: %.2f\n', truePositiveRate);
fprintf('False Positive Rate: %.2f\n', falsePositiveRate);
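The rates above depend on the chosen threshold, so it can help to sweep the threshold across the range of scores and see how the two rates trade off. The sketch below uses hypothetical anomaly scores (anomalyScore, higher meaning more anomalous) in place of the model output; with the real model you would substitute the score column used above.

```matlab
% Hypothetical anomaly scores, for illustration only (higher = more anomalous)
rng(1);
anomalyScore = [randn(31, 1) - 1; randn(10, 1) + 2];
trueLabels   = [ones(31, 1); -ones(10, 1)];   % 1 = normal, -1 = anomaly
isAnomaly    = (trueLabels == -1);

% Evaluate both rates at 50 thresholds spanning the observed scores
thresholds = linspace(min(anomalyScore), max(anomalyScore), 50);
tpr = zeros(size(thresholds));
fpr = zeros(size(thresholds));
for k = 1:numel(thresholds)
    flagged = anomalyScore > thresholds(k);               % flagged as anomaly
    tpr(k) = sum(flagged & isAnomaly)  / sum(isAnomaly);  % TP / (TP + FN)
    fpr(k) = sum(flagged & ~isAnomaly) / sum(~isAnomaly); % FP / (FP + TN)
end

figure;
plot(fpr, tpr, '-o');
xlabel('False Positive Rate');
ylabel('True Positive Rate');
title('True vs. False Positive Rate Across Thresholds');
```

Raising the threshold lowers both rates, so a suitable value is one where the false positive rate is acceptable while the true positive rate is still high.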
Please see attached.


Please let me know whether this resolves your problem, or if you have any further questions.