Text file parse and reformat and storage

Andrew Smelser on 4 Apr 2019
Answered: Guillaume on 4 Apr 2019
I have a text file that I want MATLAB to process. The file has HTML-like tags (<open> opens a tag and </open> closes it). I want MATLAB to read the file and, whenever it encounters an opening tag, save the information between <tag> and </tag> to a struct (that's the easy part). I then want the raw source text passed through to an output text file, with the only difference being added line returns (as if pressing the Enter key).
So basically it's a jumbled mess of metadata and the information I want. I want to reformat it while keeping the file contents intact. I've attached a document that shows an example of what I have, what I want the output text file to look like, and some other details. I'm currently on my mobile phone, so I don't have the code I've tried with me, but I can post it later.

Answers (1)

Guillaume on 4 Apr 2019
Your file is an XML file, so it should be parsed with an XML parser. Parsing it as text or HTML will always be fragile. Thankfully, MATLAB already has an XML parser: see xmlread.
xmlread returns a Java DOM node object that lets you navigate the XML structure in different ways. While it's very powerful, it's not particularly intuitive and can be quite daunting, so you may want to use the xml2struct File Exchange entry instead (or in addition). This will give you more or less the structure you want.
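To make the DOM navigation a bit less daunting, here is a minimal sketch of reading tag contents into a struct array with xmlread. The attached file isn't reproduced in this thread, so the sample XML and the tag names (root, record, name, value) are made up for illustration; substitute the tags from your own file.

```matlab
% Hypothetical sample data: the tag names root/record/name/value are
% assumptions, not taken from the attached file.
xmlText = ['<root><record><name>alpha</name><value>1</value></record>' ...
           '<record><name>beta</name><value>2</value></record></root>'];
sampleFile = [tempname '.xml'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

doc = xmlread(sampleFile);                     % Java DOM document
records = doc.getElementsByTagName('record');  % node list, zero-based indexing

data = struct('name', {}, 'value', {});        % empty struct array to fill
for k = 0:records.getLength() - 1
    rec = records.item(k);
    nameNode  = rec.getElementsByTagName('name').item(0);
    valueNode = rec.getElementsByTagName('value').item(0);
    % getTextContent returns a Java string, so convert it with char
    data(end + 1).name = char(nameNode.getTextContent());
    data(end).value = str2double(char(valueNode.getTextContent()));
end

delete(sampleFile);                            % clean up the temporary file
```

After the loop, data holds one element per record tag, e.g. data(2).name contains the text of the second name tag. Note that the Java node lists are indexed from 0, unlike MATLAB arrays.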
As for your reformatting, I don't particularly see the point. Both versions are exactly the same XML, and any decent code that parses XML will ignore the whitespace between tags anyway. If it's for human consumption, then a) XML is not designed for human consumption, and b) there are plenty of XML beautifiers you can download.
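If you'd still like a more readable copy without leaving MATLAB, one option (a sketch only, not a replacement for a dedicated beautifier) is to round-trip the file through xmlread and xmlwrite, which generally writes each element on its own line with indentation. The file names and sample content below are placeholders, and this assumes the source has no whitespace between tags; if it already does, the output may contain extra blank lines.

```matlab
% Placeholder input: a single-line XML document with made-up tag names.
inFile  = [tempname '.xml'];
outFile = [tempname '.xml'];
fid = fopen(inFile, 'w');
fprintf(fid, '%s', '<root><record><name>alpha</name></record></root>');
fclose(fid);

doc = xmlread(inFile);        % parse into a Java DOM document
xmlwrite(outFile, doc);       % serialize the same document to a new file

outText = fileread(outFile);  % reformatted text, element content unchanged
disp(outText);

% Re-parse the output to confirm the content survived the round trip.
check = xmlread(outFile);
nameText = char(check.getElementsByTagName('name').item(0).getTextContent());

delete(inFile);
delete(outFile);
```

Because the output is produced by the parser rather than by text manipulation, the element content is preserved; only the layout between tags changes.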

Release

R2018a
