Data extraction from txt file with changing number of columns
Hi everyone,
I am running simulations in a software package whose output is essentially a massive amount of data in .txt files. I am trying to extract and plot the data using MATLAB. This is as far as I have made it, with a trial data set for 10 Monte Carlo runs. The problem is that the number of Monte Carlo runs varies, so the number of output columns (see the attached file) varies as well. Is there a more efficient way to extract the data than what I am doing right now? Or is there a loop that could make it faster?
clc; close all; clear all;
%import data from DAMAGE
fid = fopen('All-In-19-QUICK_EVAL.txt');
%file 1 - lifetime
lifetime = fscanf(fid, '%s %s %s %s', [1 4]);
lifetimedata = fscanf(fid,'\n %f %f',[2 24]);
lifetimedata';
proportion = lifetimedata(1:2:39);
countlifetime = lifetimedata(2:2:40);
figure(1)
plot(proportion, countlifetime,'b');
title('Lifetime Proportion')
xlabel('proportion [.1%]'); ylabel('count'); xlim([0 1]);
%file 2 - altitude
altituded = fscanf(fid, '\n %s %s %s', [1 3])
altitudedata = fscanf(fid, '\n %f %f', [2 36]);
altitudedata'
altitude = altitudedata(1:2:72);
countaltitude = altitudedata(2:2:72);
% figure(2)
% plot(altitude, countaltitude,'b');
% title('Altitude');
% xlabel('altitude [km]'); ylabel('count'); xlim([0 2000]);
%file 3 - cumulative number of collisions - this section is my main problem
file3 = fscanf(fid, '\n %s %s %s', [1 30]);
file3data = fscanf(fid, '\n %f %f %f', [14 1800])
file3data'
year3 = file3data(1:14:1414);
run1in3 = file3data(2:14:1414);
run2in3 = file3data(3:14:1414);
run3in3 = file3data(4:14:1414);
run4in3 = file3data(5:14:1414);
run5in3 = file3data(6:14:1414);
run6in3 = file3data(7:14:1414);
run7in3 = file3data(8:14:1414);
run8in3 = file3data(9:14:1414);
run9in3 = file3data(10:14:1414);
run10in3 = file3data(11:14:1414);
mean3 = file3data(12:14:1414);
negStandardDev3 = file3data(13:14:1414);
posStandardDev3 = file3data(14:14:1414);
I have attached the .txt file that this code reads. NOTE: the code only extracts data from the first 3 data blocks; ideally I would like a way to extract all of the data.
1 Comment
dpb
on 3 Jul 2022
It is easy enough to read each section; the problem is that they left no trail of bread crumbs through the forest/file (number of variables, number of observations, and so on) to use in reading it, so you've got to parse the file to find all of those things.
I'd probably start on a file like this by reading the whole thing into a string array and using the empty lines to locate the sections. You'll then need a companion lookup table of the section types and the unique section heading that distinguishes which section you're reading.
It's all doable and not terribly difficult, but the tedium factor is immense in working through all the details.
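As a rough illustration of the first step described above (read everything into a string array, then use the empty lines to find the sections), here is a minimal sketch. The demo file and its contents are made up so that the snippet runs on its own; it is not the real All-In-19-QUICK_EVAL.txt, and the variable names are only for illustration. readlines needs R2020b or later.

```matlab
% Write a small, made-up demo file: two sections separated by blank lines.
demoFile = [tempname '.txt'];
fid = fopen(demoFile, 'w');
fprintf(fid, ['Lifetime Proportion\nProportion Count\n0.1 5\n0.2 7\n' ...
              '\n\n' ...
              'Altitude\nAltitude Count\n200 3\n400 9\n600 4\n']);
fclose(fid);

% Read the whole file, one string per line.
lines = readlines(demoFile);
delete(demoFile);

% A line is blank if nothing is left after trimming whitespace.
isBlank = strlength(strtrim(lines)) == 0;

% A section starts at every non-blank line that follows a blank line
% (or that is the very first line of the file).
startsSection = ~isBlank & [true; isBlank(1:end-1)];
sectionId = cumsum(startsSection);
sectionId(isBlank) = 0;              % blank lines belong to no section

% Collect the lines of each section in a cell array.
nSections = max(sectionId);
sections = cell(nSections, 1);
for k = 1:nSections
    sections{k} = lines(sectionId == k);
end
```

Each element of sections is then a string array holding the header lines and data rows of one block, which can be parsed on its own regardless of how many blocks the file contains.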
Answers (1)
per isakson
on 22 Jul 2022
Edited: per isakson
on 22 Jul 2022
The file All-In-19-QUICK_EVAL.txt contains 73 sections. Each section
- starts with two consecutive empty rows followed by
- in the final 27 sections a simulation case ID, e.g. 2028_86, followed by
- a descriptive header row followed by
- a column header row followed by
- a block of numerical data
I noticed that
- neither the header nor the column header provides names that are legal MATLAB identifiers
- not all section types have unique headers, e.g. more than one section type has the header, Number of Objects > 10 cm
- there are nine (not ten) unique simulation case IDs
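Given the section layout listed above, the varying number of Monte Carlo runs can be handled by taking the column count from the data itself instead of hard-coding it. The following is only a sketch: the section text is made up, and the column order (year first, then the runs, then mean, -SD and +SD) is an assumption based on the code in the question, not something read from the file.

```matlab
% Made-up section with 3 runs: header row, column header row, data rows.
section = ["Cumulative Number of Collisions"
           "Year Run1 Run2 Run3 Mean -SD +SD"
           "2020 0 1 0 0.3 0.0 0.8"
           "2021 1 2 1 1.3 0.8 1.8"
           "2022 3 2 2 2.3 1.8 2.8"];

header   = section(1);
dataRows = section(3:end);

% Count the columns in the first data row instead of hard-coding 14.
nCols = numel(sscanf(dataRows(1), '%f'));

% Parse all numbers at once and reshape to one row per data line.
values = sscanf(join(dataRows, " "), '%f');
data   = reshape(values, nCols, []).';

% Assumed layout: first column is the year, last three are statistics,
% and everything in between is one column per Monte Carlo run.
year = data(:, 1);
runs = data(:, 2:end-3);
nRuns = size(runs, 2);
```

Keeping the runs in one matrix means plot(year, runs) draws every run without separate run1, run2, ... variables, whatever nRuns turns out to be.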
I created a function to read and parse All-In-19-QUICK_EVAL.txt. It's a first attempt. The data of each section is stored in one structure (73 structures in total). The names of the structures are modified versions of the headers, and the names of the fields are modified versions of the column headers. (That's simple to implement.) To create the dynamically named structures I use the function matfile, which avoids eval, in line with TUTORIAL: Why Variables Should Not Be Named Dynamically (eval).
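The function avdb_3 itself is not shown in this thread, so the following is not its implementation. It is only a sketch of one way the naming step could work, assuming matlab.lang.makeValidName is used to turn headers into legal names and that the MAT-file object is written with dynamic m.(name) syntax. The column headers and data below are made up.

```matlab
% Header quoted in the answer; column headers and data are hypothetical.
header     = 'Number of Objects > 10 cm';
colHeaders = {'Year', 'Run 1', 'Mean (-1 SD)'};
data       = [2020 1 0.5; 2021 2 1.5];

% Turn the free-text headers into legal, unique MATLAB names.
varName    = matlab.lang.makeValidName(header);
fieldNames = matlab.lang.makeUniqueStrings( ...
                 matlab.lang.makeValidName(colHeaders));

% One field per column, using dynamic field names.
s = struct();
for k = 1:numel(fieldNames)
    s.(fieldNames{k}) = data(:, k);
end

% Store the structure under its generated name without eval.
matFileName = [tempname '.mat'];
m = matfile(matFileName, 'Writable', true);
m.(varName) = s;

% Read it back to check, then remove the temporary file.
loaded = load(matFileName);
delete(matFileName);
```

Since more than one section type shares a header, a real implementation would also have to make the structure names unique, for example by passing the names already used to matlab.lang.makeUniqueStrings.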
avdb_3( 'All-In-19-QUICK_EVAL.txt', 'test.mat' )
load test.mat
whos
With the help of tab completion and copy and paste, it's possible to use these names.
plot( LifetimeProportion.Proportion, LifetimeProportion.Count,'b');
title('Lifetime Proportion')
xlabel('proportion [.1%]'); ylabel('count'); xlim([0 1]);
grid