writestruct does not reproduce the input of readstruct

I find it disturbing that applying writestruct on a structure created with readstruct does not reproduce the original input: see example below.
Is there a way to fix this? readstruct is very convenient for reading and modifying lightweight XML files (in my case XDMF files to be opened by ParaView), but this issue is, in my opinion, a major flaw.
Thanks for any advice.
Adrien.
>> type input.xml
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
<Tag2 Name="foo">
1 2 3
</Tag2>
</Tag1>
>> h = readstruct("input.xml","structnodename","Tag1");
>> writestruct(h,"output.xml","structnodename","Tag1")
>> type output.xml
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
<Tag2 Name="foo">
<Text>1 2 3</Text>
</Tag2>
</Tag1>
>>
  1 Comment
Stephen23 on 16 Feb 2023
Edited: Stephen23 on 17 Feb 2023
"writestruct does not reproduce the input of readstruct"
Nor is this expected: there is a large set of input files which will produce exactly the same structure once imported. The XML standard specifically states that a lot of file formatting (e.g. different whitespace) and "irrelevant" XML formatting (e.g. attribute order) is not significant and should be considered equivalent. In your specific example, note that XML elements may contain text, attributes, other elements, or any mix of these.
Your "1 2 3" are themselves not elements or attributes, so must be text. MATLAB is semantically correct.
"...but this issue is in my opinion a major flaw."
Your proposal is impossible: in general there is no way to know which exact XML file generated a particular structure when imported into MATLAB (or any other application). The XML standard specifically states that this should not be possible.
This applies not only to READSTRUCT/WRITESTRUCT, but to every other "pair" of import/export functions, e.g. READMATRIX accepts an uncountably large set of input files, which WRITEMATRIX cannot reproduce from the matrix alone. This is a necessary corollary of applying Postel's law (be conservative in what you send, liberal in what you accept).
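The many-to-one mapping can be checked directly. The following is a minimal sketch, assuming a release that has readstruct and that readstruct ignores whitespace between elements, as described above. The temporary file names and the variable names are made up for illustration.

```matlab
% Two XML documents with the same content but different formatting.
% (Illustrative content modelled on the question's input.xml.)
xmlA = sprintf(['<?xml version="1.0" encoding="UTF-8"?>\n' ...
    '<Tag1 Version="2">\n' ...
    '    <Tag2 Name="foo">1 2 3</Tag2>\n' ...
    '</Tag1>\n']);
xmlB = ['<?xml version="1.0" encoding="UTF-8"?>' ...
    '<Tag1 Version="2"><Tag2 Name="foo">1 2 3</Tag2></Tag1>'];

% Write both documents to temporary files.
fileA = [tempname '.xml'];
fileB = [tempname '.xml'];
fid = fopen(fileA,'w'); fprintf(fid,'%s',xmlA); fclose(fid);
fid = fopen(fileB,'w'); fprintf(fid,'%s',xmlB); fclose(fid);

% Import both files and compare the resulting structures.
a = readstruct(fileA);
b = readstruct(fileB);
sameStruct = isequal(a,b);   % compares field names and values
disp(sameStruct)
```

If the two structures compare equal, nothing in the struct records which of the two files it came from, so writestruct has to pick one layout on export.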


Accepted Answer

Jeremy Hughes on 16 Feb 2023
Unfortunately, there's no way to get that to happen 100% of the time, and readstruct/writestruct are not meant to do that. In fact, a round trip from a data source to any other representation and back again is seldom lossless unless the two systems were designed together to be that way.
In this case, in order to be able to do document transformation, you have to know far more than what a MATLAB struct is capable of storing. E.g. whether the data was in an attribute, or part of the text node as in your case, or in a node named "Text" to begin with.... or this thing:
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
<Tag2 Name="foo">
<Text>1</Text>
<Text>2</Text>
<Text>3</Text>
</Tag2>
</Tag1>
To get that kind of fidelity for a round trip of XML across all valid XML files, you necessarily lose some simplicity. Essentially, there's not a 1:1 mapping between MATLAB structs and XML. To get that and still have some semblance of usability you have to move to objects.
Good(ish) news, there is a MATLAB XML DOM: https://www.mathworks.com/help/matlab/import_export/importing-xml-documents.html, and it (and similarly xmlread) will allow you to do anything, but at a cost of both a steeper learning curve, and more code complexity. Note that the XML DOM APIs are not designed by MathWorks; here's the spec: https://www.w3.org/TR/WD-DOM/
... but I don't recommend the read unless you're having trouble sleeping.
Basically, it's not a simple problem. XML is a very complex format, and working with it can be painful. readstruct/writestruct are really about getting some MATLAB data into that format and back again, and not the other way around, though readstruct should allow you to get data into MATLAB and work with it for general XML files.
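For a file as small as the one in the question, the DOM route can stay fairly short. Here is a minimal sketch using xmlread and xmlwrite, which expose the Java DOM API. The temporary file names and the appended values " 4 5 6" are made up for illustration, and the exact whitespace of the written file may differ from the input.

```matlab
% Build a small input file so the sketch is self-contained.
inFile  = [tempname '.xml'];
outFile = [tempname '.xml'];
fid = fopen(inFile,'w');
fprintf(fid,'%s\n', ...
    '<?xml version="1.0" encoding="UTF-8"?>', ...
    '<Tag1 Version="2">', ...
    '<Tag2 Name="foo">', ...
    '1 2 3', ...
    '</Tag2>', ...
    '</Tag1>');
fclose(fid);

% Parse the file into a DOM document and locate the first <Tag2> element.
doc   = xmlread(inFile);
nodes = doc.getElementsByTagName('Tag2');
tag2  = nodes.item(0);               % DOM node lists are zero-based

% Append values to the existing text content of <Tag2>.
oldText = strtrim(char(tag2.getTextContent()));
tag2.setTextContent([oldText ' 4 5 6']);

% Write the modified document; the text stays directly inside <Tag2>.
xmlwrite(outFile, doc);
type(outFile)
```

Because the DOM keeps text nodes, attributes, and child elements as distinct node types, no extra Text element is introduced on export.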
  3 Comments
Adrien Leygue on 21 Feb 2023
Thank you Jeremy for this answer. I know that the problem is not easy due to the flexibility of XML w.r.t. MATLAB structures. It is just very frustrating because I was hoping to have a simpler way (I am currently using the MATLAB XML DOM) to output my results and append data a posteriori. My secret hope was a hidden option similar to "AttributeSuffix" that would allow reading implicit text elements and also writing them implicitly. Maybe I'll just program this myself (the files I am dealing with are not that complex).
Thx for your answer and time.
Adrien.
Jeremy Hughes on 21 Feb 2023
Thanks for the info Adrien,
I'll put in an enhancement request for an option to handle "Text" fields on writing.
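Until such an option exists, one possible stopgap for simple files is to post-process the output of writestruct. The sketch below unwraps every <Text>...</Text> pair with regexprep. It assumes the original document never contains a genuine element named Text, which is exactly the ambiguity described in the answer above. The temporary file names are made up for illustration.

```matlab
% Build a small input file so the sketch is self-contained.
inFile  = [tempname '.xml'];
outFile = [tempname '.xml'];
fid = fopen(inFile,'w');
fprintf(fid,'%s\n', ...
    '<?xml version="1.0" encoding="UTF-8"?>', ...
    '<Tag1 Version="2">', ...
    '<Tag2 Name="foo">', ...
    '1 2 3', ...
    '</Tag2>', ...
    '</Tag1>');
fclose(fid);

% Round trip through a struct, as in the question.
h = readstruct(inFile,"StructNodeName","Tag1");
writestruct(h,outFile,"StructNodeName","Tag1");

% Unwrap <Text>...</Text> so the text sits directly in its parent element.
% ASSUMPTION: no real element is called Text in the documents being handled.
xml = fileread(outFile);
xml = regexprep(xml,'<Text>(.*?)</Text>','$1');

% Overwrite the output file with the unwrapped XML.
fid = fopen(outFile,'w','n','UTF-8');
fprintf(fid,'%s',xml);
fclose(fid);
type(outFile)
```

This does not restore the original whitespace or layout; it only removes the explicit Text elements, which may be enough for a reader that expects the values directly inside the parent element.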
