Textscan import string data from .txt file

Question

Linus Dock on 13 Nov 2021


https://nl.mathworks.com/matlabcentral/answers/1585719-textscan-import-string-data-from-txt-file

Commented: dpb on 17 Nov 2021

Accepted Answer: dpb

202103.txt

Hi!

When I'm using textscan to read my data I get all the data but it's not quite organized the way I would like.

I will attach a sample .txt file.

I'm using this code to import the data:

%Import all .txt files selected by the user's input time and convert them into strings
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
    filename = sprintf('%s.txt',w(h,:)); %add .txt to year and month
    fileID = fopen(filename); %open filename to create fileID
    Data{h} = textscan(fileID,'%s','delimiter','\n'); %read all characters in fileID 
    fclose(fileID); %close fileID
end

What I would like is one string per report, each starting with METAR ESXX and ending with a varying group (e.g. Q1011, R08/750135, or something else).

I've tried different delimiters, but I get more or less the same result with each of them.

As far as I can tell, the problem occurs when the data is not delimited by a newline, but I can't find the right solution to get it working.

In a previous version of my code I was using fread, but I understand that textscan is the better choice. Is that correct?

Do you have any suggestions for what could be changed?

Thanks!

This is a sample of the result in Data.

'M04/M06 Q1020

METAR ESKN 160020Z 31003KT 0300 R08/P2000N R26/1100N BCFG NSC'

'M04/M05 Q1010 R08/750135

METAR ESKN 160050Z 31003KT 5000 BR FEW064 M04/M04 Q1011 R08/750135

METAR ESKN 160120Z 31003KT CAVOK M03/M03 Q1011 R08/750135

METAR ESKN 160150Z VRB01KT 9999 FEW003 BKN061 M03/M03 Q1011'

'R08/750135

METAR ESKN 160220Z 32004KT 9999 SCT042 BKN055 BKN066 M02/M02 Q1011'

'R08/750135

METAR ESKN 160250Z 28003KT 9999 SCT003 BKN036 BKN057 M02/M02 Q1012'

'R08/750135

METAR ESKN 160320Z VRB02KT 9999 BKN002 M02/M02 Q1012 R08/750135

METAR ESKN 160350Z 33004KT 9999 BKN002 M01/M01 Q1012 R08/750135

METAR ESKN 160420Z VRB01KT 9999 BKN002 M01/M01 Q1012 R08/750135

METAR ESKN 160450Z 00000KT 4000 BR SCT003 M02/M02 Q1013 R08/710195

METAR ESKN 160520Z 30003KT 0300 R08/P2000N R26/0750U BCFG FEW003'

'SCT072 M03/M03 Q1013 R26/710195

METAR ESKN 160550Z VRB03KT 9000 SCT066 M03/M03 Q1013 R26/710195

METAR ESKN 160620Z 29003KT 9999 FEW002 BKN068 M02/M02 Q1013'

'R26/710195

METAR ESKN 160650Z 31003KT 9999 FEW002 SCT068 M00/M00 Q1014'


Answer 1

dpb on 13 Nov 2021

https://nl.mathworks.com/matlabcentral/answers/1585719-textscan-import-string-data-from-txt-file#answer_830704

Read the file as is and then clean it up instead...

d=readcell('202103.txt','Delimiter',newline);               % read a cellstr array
i1=find(~startsWith(d,'METAR'))-1;                          % locate first of line pairs
for i=1:numel(i1)                                           % and merge those by pair
  d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[];                                                 % then eliminate the second

Sanity check...

>> all(startsWith(d,'METAR'))
ans =
  logical
   1
>> 
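
To see what the merge does without needing the attached file, here is a minimal sketch on a small in-memory cellstr. The three lines are taken from the sample in the question; the assumption is that every continuation line directly follows the METAR line it belongs to, and that the first line starts with METAR (otherwise i1 would contain 0 and the indexing would fail).

```matlab
% Three lines: a METAR line, its wrapped continuation, and a complete METAR line
d = {'METAR ESKN 160020Z 31003KT 0300 R08/P2000N R26/1100N BCFG NSC'; ...
     'M04/M05 Q1010 R08/750135'; ...
     'METAR ESKN 160050Z 31003KT 5000 BR FEW064 M04/M04 Q1011 R08/750135'};

i1 = find(~startsWith(d,'METAR')) - 1;     % index of the line BEFORE each continuation
for i = 1:numel(i1)
    d(i1(i)) = join(d(i1(i):i1(i)+1));     % join inserts a single space between the two
end
d(i1+1) = [];                              % remove the now-merged continuation lines

disp(d)
```

After the merge, d has two elements and both start with METAR; the first one now ends with R08/750135.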

Linus Dock on 16 Nov 2021

Hello again!

Thanks for your help!

I can't get the readcell function to work with my version of MATLAB (R2018b).

I tried incorporating your suggestion into my code, but I'm afraid I can't get it to work properly.

%Import all .txt files selected by the user's input time and convert them into strings
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
    filename = sprintf('%s.txt',w(h,:)); %add .txt to year and month
    fileID = fopen(filename); %open filename to create fileID
    Data{h} = textscan(fileID,'%s','delimiter','\n'); %read all characters in fileID 
    fclose(fileID); %close fileID
end
d=Data{:}{1};
%d=readcell('202103.txt','Delimiter',newline);               % read a cellstr array
i1=find(~startsWith(d,'METAR'))-1;                          % locate first of line pairs
for i=1:numel(i1)                                           % and merge those by pair
  d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[];                                                 % then eliminate the second

d is now just a 1x1 cell with the following content:

'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00 Q1030 R21/09//95

METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00'

The code seems to separate the groups correctly, judging by the return symbol in front of the METAR group below. But how do I get the output as separate cells, each containing one METAR line?

{'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00 Q1030 R21/09//95↵METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00'}
{'Q1030 R21/09//95↵METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00'}
{'Q1030 R21/09//95↵METAR ESGG 010120Z 21007KT 0150 R03/0500N R21/0500N FG VV002 01/00'}
{'Q1030 R21/09//95↵METAR ESGG 010150Z 19007KT 0100 R03/0500N R21/0450N FG VV002 01/00'

dpb on 16 Nov 2021

Oh. Unfortunately for you, readcell was introduced in R2019a.

I had difficulty with textscan, too...the input file contains \r at the end of each METAR line and \n after the short lines. That seemed to confuse all the past ways I've used to return records as cellstr inside textscan.
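
One way to check which terminators a file really contains is to read it raw and count the carriage returns and line feeds. This is only a sketch: the file written below is a made-up two-report sample (station AAAA is hypothetical) that follows the terminator pattern described above, not the attached 202103.txt.

```matlab
% Write a tiny sample with mixed terminators (made-up data for illustration)
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'METAR AAAA 010020Z FG VV003 01/00\r Q1030\nMETAR AAAA 010050Z CAVOK 01/00\r Q1031\n');
fclose(fid);

raw   = fileread(fname);                           % whole file as one char vector
nCR   = sum(raw == char(13));                      % carriage returns (\r)
nLF   = sum(raw == char(10));                      % line feeds (\n)
nCRLF = numel(strfind(raw,[char(13) char(10)]));   % Windows-style \r\n pairs
fprintf('CR: %d, LF: %d, CRLF pairs: %d\n', nCR, nLF, nCRLF);

delete(fname);                                     % clean up the temporary file
```

For this sample it prints CR: 2, LF: 2, CRLF pairs: 0. Running the same three counting lines on a real file shows whether \r and \n appear separately or as pairs.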

My usual fallback in such cases is to return to the venerable (but deprecated) textread but it also failed with a (new to me) buffer overflow because it, too, apparently became confused by the disparate terminators.

So, before reverting to fgetl and a loop (which isn't all that bad, actually, just a little more code to write, but less than your above loop), I tried the simple expedient of

>> d=importdata('202103.txt');
>> whos d
  Name          Size               Bytes  Class    Attributes
  d         77796x1             15658840  cell               
>> d(1:6)
ans =
  6×1 cell array
    {'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00'}
    {' Q1030 R21/09//95'                                                 }
    {'METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00'}
    {' Q1030 R21/09//95'                                                 }
    {'METAR ESGG 010120Z 21007KT 0150 R03/0500N R21/0500N FG VV002 01/00'}
    {' Q1030 R21/09//95'                                                 }
>> 

and joy ensues.

Now the previous join trick should work as expected.
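
If importdata ever behaves differently on another file or release, a terminator-agnostic alternative is to read the raw text and split it on any kind of line break. Note that this is a different technique from the pairwise join above: it groups lines with cumsum, so a report wrapped over more than two lines is also handled. The sample file and station AAAA are made up for illustration.

```matlab
% Write a small sample with mixed \r and \n terminators (made-up data)
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'METAR AAAA 010020Z FG VV003 01/00\r Q1030 R21/09//95\nMETAR AAAA 010050Z CAVOK 01/00 Q1030\n');
fclose(fid);

raw = fileread(fname);
d = regexp(raw,'\r\n|\r|\n','split')';     % split on any line terminator, as a column
d = d(~cellfun(@isempty,d));               % drop empty pieces (e.g. after the last terminator)

grp = cumsum(startsWith(d,'METAR'));       % every line gets the number of its report
reports = cell(max(grp),1);
for k = 1:max(grp)
    % lines before the first METAR have grp == 0 and are skipped
    reports{k} = strjoin(strtrim(d(grp==k))',' ');
end

delete(fname);                             % clean up the temporary file
disp(reports)
```

The strtrim call removes the leading space on the continuation lines, so each merged report has single spaces between groups.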

Linus Dock on 17 Nov 2021

Awesome! There was joy!

Importdata did the trick.

This is what worked for me

Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
  filename = sprintf('%s.txt',w(h,:));
  d=importdata(filename);
  i1=find(~startsWith(d,'METAR'))-1;
  for i=1:numel(i1)                 
    d(i1(i))=join(d(i1(i):i1(i)+1));
  end
  d(i1+1)=[];    % then eliminate the second
  Data{h}=d;    % will save into your large array
end     

Thanks a lot!

dpb on 17 Nov 2021

I was sure it would... :)

Glad to help; sorry, I didn't know you were on an earlier release initially...
