Testing unlabeled data on a trained model
Dear MATLAB community,
I need to know whether there is a way to test the reliability of predictions made when classifying new (unlabeled) data with an already trained model.
This is what I did:
1) Create a dataset with labeled data, with 2 predictors and 3 response variables (training set);
2) Fit and validate a Multiclass Support Vector Machine classifier using the training set;
3) Use the obtained model to make predictions on a new dataset with unlabeled data (test set).
I would like to know which classification metrics (if any) can be used to establish the reliability of this classification, given that the new data is unlabeled.
Thanks.
4 Comments
Tarunbir Gambhir
on 29 Oct 2020
The reliability of predictions made by a trained model is generally assessed using a labeled test set.
I suggest you split your labeled dataset into training, validation, and test sets. The training set is used to fit the model, the validation set is used to tune the model's hyperparameters, and the test set gives you the performance, or reliability, of your final trained model.
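As a rough sketch of such a split in MATLAB, assuming Statistics and Machine Learning Toolbox is available. The synthetic data, the 60/20/20 proportions, the Gaussian kernel, and the candidate box-constraint values are all illustrative assumptions, not part of your setup:

```matlab
% Synthetic stand-in for the labeled dataset: 2 predictors, 3 classes.
% (Data, sizes, and variable names are illustrative assumptions.)
rng(0);                                    % reproducible example
n = 300;
Y = repelem((1:3)', n/3);                  % class labels 1, 2, 3
centers = [0 0; 3 3; 0 4];                 % hypothetical class centers
X = centers(Y, :) + randn(n, 2);           % one row per observation

% Hold out 20% of the labeled data as the final test set (stratified by class).
cvTest = cvpartition(Y, 'HoldOut', 0.2);
Xtest = X(test(cvTest), :);      Ytest = Y(test(cvTest));
Xrest = X(training(cvTest), :);  Yrest = Y(training(cvTest));

% Split the remainder into training (75%) and validation (25%) sets.
cvVal = cvpartition(Yrest, 'HoldOut', 0.25);
Xtrain = Xrest(training(cvVal), :);  Ytrain = Yrest(training(cvVal));
Xval   = Xrest(test(cvVal), :);      Yval   = Yrest(test(cvVal));

% Tune one hyperparameter (the SVM box constraint) on the validation set.
boxValues = [0.1 1 10];
valErr = zeros(size(boxValues));
for k = 1:numel(boxValues)
    t = templateSVM('KernelFunction', 'gaussian', 'KernelScale', 'auto', ...
                    'Standardize', true, 'BoxConstraint', boxValues(k));
    mdl = fitcecoc(Xtrain, Ytrain, 'Learners', t);
    valErr(k) = loss(mdl, Xval, Yval);     % misclassification rate
end
[~, best] = min(valErr);

% Refit on training + validation data with the chosen value,
% then evaluate once on the untouched test set.
tBest = templateSVM('KernelFunction', 'gaussian', 'KernelScale', 'auto', ...
                    'Standardize', true, 'BoxConstraint', boxValues(best));
finalMdl = fitcecoc([Xtrain; Xval], [Ytrain; Yval], 'Learners', tBest);
testErr = loss(finalMdl, Xtest, Ytest);
fprintf('Test misclassification rate: %.3f\n', testErr);
```

The test set is used only once, after tuning is finished, so `testErr` is not biased by the hyperparameter search.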
Amanda
on 29 Oct 2020
Tarunbir Gambhir
on 29 Oct 2020
If your labeled training data and the unlabeled test data are highly correlated (that is, they come from similar distributions), the best thing you can do is hold out a small partition of the labeled training data as test data to get a quantitative measure of reliability. That similarity should mean comparable performance on your unlabeled test data.
Apart from this, I don't think there is any reliable way to measure the performance of your model on real data without ground truth.
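A minimal sketch of that hold-out estimate, again assuming Statistics and Machine Learning Toolbox. The synthetic data, the 15% hold-out fraction, and the variable names (`Xnew`, `newLabels`, and so on) are hypothetical placeholders for your own data:

```matlab
% Synthetic labeled data (2 predictors, 3 classes) and synthetic unlabeled data.
% Everything here is an illustrative assumption, not your actual dataset.
rng(1);
n = 300;
Y = repelem((1:3)', n/3);
centers = [0 0; 3 3; 0 4];
X = centers(Y, :) + randn(n, 2);
Xnew = centers(randi(3, 50, 1), :) + randn(50, 2);   % labels deliberately discarded

% Keep a small, stratified partition of the labeled data aside as test data.
c = cvpartition(Y, 'HoldOut', 0.15);
mdl = fitcecoc(X(training(c), :), Y(training(c)));   % default linear SVM learners

% Quantify performance on the held-out labeled partition.
Yhold = Y(test(c));
Ypred = predict(mdl, X(test(c), :));
C = confusionmat(Yhold, Ypred);            % rows: true class, columns: predicted
accuracy = sum(diag(C)) / sum(C(:));
recallPerClass = diag(C) ./ sum(C, 2);     % fraction of each true class recovered
fprintf('Hold-out accuracy: %.3f\n', accuracy);

% Predictions on the unlabeled data; no metric can be computed for these directly.
newLabels = predict(mdl, Xnew);
```

The hold-out accuracy and per-class recall only carry over to `newLabels` to the extent that the unlabeled data resembles the labeled data.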
Amanda
on 29 Oct 2020
Answers (0)