xml text block that is actually floats/single - how to convert character array to single array

I have an XML file with lots of nested data. One of the fields holds a single array of 1656 values that are stored in the XML as text. I can get the text as a string variable (an 8948-character array). How can I tell MATLAB to interpret this array as a 32-bit float array of 1656 values?
If char_data is the character array:
I have tried: value_data = single(char_data);
but that converts each character to its ASCII value.
I have also tried: value_data = cast(char_data,'single');
and value_data = typecast(char_data,'single');
but all of these just convert the individual characters to their ASCII values as single types.
It would seem I need to convert it to binary and tell MATLAB to interpret the block as a single array, but I am not seeing the functions to do that. I could write the text to a file and then read it with fread, telling it the data type, but it would seem there should be a more direct way to do this.
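For reference, the "interpret the block as a single array" step is what typecast does, but it is documented for numeric input such as uint8 rather than char. A minimal sketch with four hand-picked bytes (illustrative only, not taken from the file), assuming a little-endian machine:

```matlab
% Four bytes that spell 1.0 as a little-endian IEEE 754 single (0x3F800000).
raw = uint8([0 0 128 63]);

% single() converts element by element: four numbers, 0 0 128 63.
as_values = single(raw);

% typecast keeps the bytes and changes only the type: one number.
as_float = typecast(raw, 'single');   % 1 on a little-endian machine
```

So the missing piece is getting from the text to the raw bytes, not the reinterpretation itself.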
Here is an example of the XML file:
<OPUSDataFile><AB__Multiple><Ydata ylabel="Absorbance Units" label="" block="0"><values byteorder="INTEL" format="FLOAT32" numvalues="1656">SX6TvL5Sk7yERZO8NE+TvDlNk7wZMZO8/RKTvFfskryPvZK8NLKSvDvOkrwNDZO8fkyTvIRtk7xJdJO87k+TvKk3k7wuR5O8iS+TvOcBk7yB3JK81tqSvKk3k7xSypO8R/GTvF5xk7zUcpO8wfWTvDi2k7yIKpO8jh6TvFO2k7z6bpS8QzeUvJSPk7zZiZO8XnuTvIgWk7y0jZO8cpaUvDSBlLwOspO8+VqT ...
  5 Comments
Stephen23 on 22 Jan 2021
Edited: Stephen23 on 22 Jan 2021
@Brent: what do the characters represent? (in other words, what is the encoding scheme for them?)
If the encoding simply used the ASCII (one-byte) value of each character, then I would expect to see a fairly random selection of control characters and non-letter characters, but the string seems to consist solely of alphabetic, numeric, plus, and forward slash characters. This subset implies some mapping, which is currently not defined.
The complete set of alphanumeric characters plus '+' and '/' gives 64 characters, which is a bit of a coincidence:
str = 'SX6TvL5Sk7yERZO8NE+TvDlNk7wZMZO8/RKTvFfskryPvZK8NLKSvDvOkrwNDZO8fkyTvIRtk7xJdJO87k+TvKk3k7wuR5O8iS+TvOcBk7yB3JK81tqSvKk3k7xSypO8R/GTvF5xk7zUcpO8wfWTvDi2k7yIKpO8jh6TvFO2k7z6bpS8QzeUvJSPk7zZiZO8XnuTvIgWk7y0jZO8cpaUvDSBlLwOspO8+VqT';
vec = unique(str) % a few characters are missing
vec = '+/01235678BDEFGIJKLMNOPQRSTUVWXZabcdefghijklnpqrstuvwxyz'
str = ['+/','0':'9','a':'z','A':'Z']; % all of them
numel(str)
ans = 64
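One reading that fits both observations (a 64-character alphabet, and format="FLOAT32" with byteorder="INTEL") is that the text is Base64-encoded little-endian binary. That is an assumption; nothing in the thread confirms it. Under it, 1656 values take 6624 bytes, which encode to 8832 Base64 characters, and 116 line breaks (one per 76 characters) would account for the 8948 reported. A minimal sketch using the first 16 characters of the sample:

```matlab
% Assumption: the <values> text is Base64 and the bytes are little-endian float32.
char_data = 'SX6TvL5Sk7yERZO8';            % first 16 characters of the sample (12 bytes)

clean = regexprep(char_data, '\s', '');    % drop any line breaks or spaces in the XML text
raw   = matlab.net.base64decode(clean);    % uint8 vector (needs R2016b or later)

% typecast reinterprets bytes in the machine's native order; swap if that is big-endian.
[~, ~, endian] = computer;
value_data = typecast(raw, 'single');
if endian == 'B'
    value_data = swapbytes(value_data);
end

disp(value_data)                           % three values near -0.018
```

For the full block, numel(value_data) should equal the numvalues attribute (1656); if it does not, the Base64 assumption is probably wrong.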
Brent on 22 Jan 2021
Thanks for the help. I can't find any guidance from the instrument manufacturer or from other online searches on how to decode this information. I thought the 32-bit format and byte order were sufficient, but apparently not. I have found an alternate workaround using a different file format, so for now I have a solution, but I still don't know how to decode this data.
I also got a hex editor and looked at the data and didn't see any pattern that I could identify. I think you are right that it is not just a binary dump, something else is being used.


Answers (0)
