Why does unicode2native return different values on different MATLAB versions?

I use two MATLAB versions, R2020b and R2022b. The function unicode2native returns a different result in each version for the same single character.
On R2020b, it returns 26.
On R2022b, it returns an array of 3 uint8 values:
ans = 1×3
239 191 189
Why does this happen?

Accepted Answer

Rik on 24 Feb 2023
I couldn't find this in the release notes, but apparently the default encoding changed between these versions (the documentation says the function uses the "user default encoding" when you don't specify one).
You can reproduce the results when specifying the target encoding:
unicode2native(char(65533),'ISO-8859-1') % picked on R2020b
ans = uint8 26
unicode2native(char(65533),'UTF-8') % picked on R2022b
ans = 1×3
239 191 189
In general I think people expect UTF-8 when converting char to uint8, so the new default makes sense.
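To avoid depending on the default, you can always pass the encoding explicitly. The snippet below is only a minimal sketch using the same character as above; the variable names are mine. It also shows why the two results are not equivalent: the UTF-8 bytes convert back to the original character, while the single byte 26 (which is the ASCII SUB control character, presumably used because ISO-8859-1 cannot represent this character) does not.

```matlab
% U+FFFD, the character used in the examples above.
c = char(65533);

% Specify the encoding explicitly so the result does not depend on the
% MATLAB release or on operating system settings.
utf8Bytes  = unicode2native(c, 'UTF-8');       % 239 191 189
latinBytes = unicode2native(c, 'ISO-8859-1');  % 26

% Convert the bytes back with the matching encoding.
backFromUtf8  = native2unicode(utf8Bytes, 'UTF-8');
backFromLatin = native2unicode(latinBytes, 'ISO-8859-1');

utf8RoundTrips  = isequal(backFromUtf8, c)   % true: UTF-8 is lossless here
latinRoundTrips = isequal(backFromLatin, c)  % false: the character was replaced
```

If the bytes are written to a file or sent over a network, the reader has to use the same encoding when calling native2unicode.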
Rik on 24 Feb 2023
@Steven Lord Thank you for confirming the section in the RN is indeed related to this. I did read that part, but I didn't fully understand it, since it looked to me as if this would have been the expected behavior since R2020a ("UTF-8 was adopted as MATLAB's default character encoding to ensure that all Unicode code points can be correctly represented in files and byte streams." - RN 2020a).
But in retrospect it makes sense that there is some change in behavior. Why else mention it in the release notes? (and on close inspection, there is a qualification for Windows: "On the Windows platform, if the Use Unicode UTF-8 for worldwide language support option is enabled in the Windows Region settings dialog box, then MATLAB uses UTF-8 as its system encoding.")
Anyway, the mystery seems to be solved.
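For anyone who wants to check which behavior their own installation shows (for example on Windows, where the qualification quoted above applies), one rough approach is to compare the default result with the explicit UTF-8 result. This is just a sketch using documented functions; the variable names are mine, and a match only tells you the default is consistent with UTF-8 for this one character.

```matlab
% Character that ISO-8859-1 cannot represent, so the default result
% differs visibly between a UTF-8 default and a legacy default.
c = char(65533);

defaultBytes = unicode2native(c);           % uses the default encoding
utf8Bytes    = unicode2native(c, 'UTF-8');  % explicit UTF-8 for comparison

usesUtf8 = isequal(defaultBytes, utf8Bytes);

if usesUtf8
    disp('The default encoding behaves like UTF-8 for this character.');
else
    fprintf('The default encoding is not UTF-8; default bytes: %s\n', ...
        mat2str(defaultBytes));
end
```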
