low weighted cross entropy values

Ève
Ève on 19 Aug 2025
Commented: Ève on 21 Aug 2025
I am building a network for semantic segmentation with a weighted cross-entropy loss. The crossentropy() function makes it possible to add weights for my 8 classes (inverse-frequency, normalized weights for each class). My issue is that the loss values calculated during training seem lower than what I should expect: they are between 0 and 1, but I would have expected them to be between 2 and 3.
My class weights vector is
norm_weights = [0.0011 0.4426 0.0023 0.0037 0.0212 0.0022 0.0065 1.0000];
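(For context, inverse-frequency weights of this kind are typically computed along these lines. This is a minimal sketch, not code from my project; pxds and the use of countEachLabel from a pixelLabelDatastore are illustrative assumptions.)

```matlab
% Sketch: inverse-frequency class weights, normalized so the largest is 1.
% Assumes pxds is a pixelLabelDatastore (illustrative, not from this post).
tbl = countEachLabel(pxds);                 % per-class pixel counts
freq = tbl.PixelCount / sum(tbl.PixelCount);
inv_freq = 1 ./ freq;                       % rare classes get large weights
norm_weights = inv_freq / max(inv_freq);    % scale so the max weight is 1
```

Note that with this normalization the weights do not sum to 1; frequent classes end up with near-zero weights.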
And this is how I implement my loss function:
lossFcn = @(Y,T) crossentropy(Y,T,norm_weights,WeightsFormat="UC",...
NormalizationFactor="all-elements",ClassificationMode="multilabel")*n_class_labels;
[netTrained2, info] = trainnet(augmented_ds,net2,lossFcn,options);
If anyone would have a clue about the issue, that would be helpful!
  3 Comments
Ève
Ève on 19 Aug 2025
I am reproducing a network from a research paper. My network architecture and training options are the same, and my data is from the same database. In their loss graphs, the initial loss values during training are between 2 and 3, so I assumed that this should also be the case for my network. When I use the crossentropy function without weights, such as:
[netTrained1, info] = trainnet(augmented_ds,net1,'crossentropy',options);
I do get higher loss values than when I 'personalize' my crossentropy loss function so that it has weights.
Ève
Ève on 19 Aug 2025
I reproduced the methodology from this research article as closely as I could, including how they format their network input. I am questioning whether there is a problem with my loss function, because the loss values that I obtain are actually very small. I said that they were between 0 and 1, but I should have specified that they currently gravitate around 0.026502. I know that the goal is for the loss to tend towards zero, but my network isn't trained yet (I reproduced a SegNet architecture) and my training accuracy is around 20%, so the loss values seem very low to me.


Accepted Answer

Matt J
Matt J on 19 Aug 2025
There are a few possible reasons for the discrepancy that I can think of:
(1) Your norm_weights do not add up to 1.
(2) You have selected NormalizationFactor="all-elements" in crossentropy(). According to the doc, though, trainnet does not normalize over all elements; it ignores the channel dimensions.
(3) There may be other hidden normalization factors buried in the black box that is trainnet(). I don't know if it is possible or worthwhile to try to dig them out.
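A quick way to see the effect of point (1) is to compare the same predictions under no weights and under the posted norm_weights: because most classes carry near-zero weights, the weighted loss comes out much smaller. This is a sketch with random illustrative data, not the poster's actual pipeline:

```matlab
% Sketch: small class weights shrink the loss relative to the unweighted case.
rng(0)
scores  = softmax(dlarray(rand(8,100),"CB"));                  % 8 classes, 100 observations
targets = dlarray(onehotencode(categorical(randi(8,1,100)),1),"CB");
w = [0.0011 0.4426 0.0023 0.0037 0.0212 0.0022 0.0065 1.0000];
lossUnweighted = crossentropy(scores,targets);                 % roughly log(8) for near-uniform scores
lossWeighted   = crossentropy(scores,targets,w,WeightsFormat="UC"); % much smaller
```

Loosely, the weighted cross entropy is a sum of -w_c*t_c*log(y_c) terms, so its overall scale tracks the scale of the weights themselves.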
  7 Comments
Ève
Ève on 21 Aug 2025
I understand, I'll try your suggestions. Thanks a lot for the feedback, I really appreciate it.
Ève
Ève on 21 Aug 2025
I'll accept your answer: as you suggested, my loss values are low simply because they reflect the scale of my weights, which are for the most part very small. I may revise the way I calculate them. I'll also add, for anyone reading this, that I was wrong about the ClassificationMode in my lossFcn; for my type of classification problem it should be set to "single-label" (the default). I left the rest of the function the same.
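For completeness, a cleaned-up loss function along the lines discussed in this thread might look like the sketch below. Rescaling norm_weights to sum to 1 is one possible way to address point (1) above, not something the original poster confirmed doing:

```matlab
% Sketch: weighted cross entropy with the default single-label mode and
% weights rescaled to sum to 1 (one way to address point (1) above).
w = norm_weights / sum(norm_weights);         % make the weights sum to 1
lossFcn = @(Y,T) crossentropy(Y,T,w,WeightsFormat="UC", ...
    ClassificationMode="single-label");       % default mode for this problem
[netTrained2, info] = trainnet(augmented_ds,net2,lossFcn,options);
```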

