How to import data from attatched csv file?

In the attatched csv file there is a lot of clutter other than data, first 11 lines are not needed, so I used
T=readtable('data.csv','HeaderLines',11);
but in the data also, every data point has its unit GHz attached to it, which I dont want to import into my array, how can I stop that GHz from being imported. What more lines of code needs to be added. For ex: output from above readtable function is :
Var1 Var2
_____________ ______
{'1.000 GHz'} 200
{'1.100 GHz'} 200
{'1.200 GHz'} 200
{'1.300 GHz'} 200
{'1.400 GHz'} 200
{'1.500 GHz'} 200
{'1.600 GHz'} 199.99
{'1.700 GHz'} 199.99
{'1.800 GHz'} 199.99
{'1.900 GHz'} 199.99
{'2.000 GHz'} 199.99
It is taking var1 as string because of the presence of GHz, please someone help me remove that. Also explain how can I name change names of these var1 and var2 by using appropritae argument is readtable function.
I noticed another problem, in the var2 column the data is till 3 decimal places, but imported data is automatically rounded off to 2 decimal place, i don't want it to be rounded off. How to avoid that?
Thankyou in advance.

2 Comments

There's no attached csv file.
So sorry Chris, I added the question again and forgot to add to this one, now its attached.

Sign in to comment.

 Accepted Answer

You could try using the Format name-value pair. You might need to read about it in textscan for the specifics of how it works. This allows you to specify the format of the delimited values in each row.
Space gets treated as a delimiter here. Since there is a space between the number and GHz, it can be read in independent of the number and actually be ignored. Since we don't have your csv, I can't test it for your data, but try somethign like this.
T=readtable('data.csv','HeaderLines',11,"Format",'%f%*s%f')
From the sample data you shared, this is the result I get.
T =
Var1 Var2
____ ______
1 200
1.1 200
1.2 200
1.3 200
1.4 200
1.5 200
1.6 199.99
1.7 199.99
1.8 199.99
1.9 199.99
2 199.99

5 Comments

Thankyou Chris, the format pointers worked and removed the GHz :),
however there is still the problem of MATLAB rounding off the data to 2 decimal places, like the first 6 data points are not actually 200, but 199.998, 199.997, 199.997, 199.996, 199.995 and so on...
Also if you could tell how to change the names of var1 and var2 variables, I'd appreciate that very much.
Have you tested it with the actual data file? Remember, I didn't have the data file when I ran the sample code. I'd made up the data using what you showed in your original post, which was 200 for the first 6. When I rerun the exact same code on data.csv, I see internally MATLAB has read in the data correctly.
Inspect your variable in the Variable Editor (double click on the variable name in the Workspace to open it in the Variable Editor). When I do, I see desired resolution. It is only rounding the display.
1 199.998000000000
1.10000000000000 199.997000000000
1.20000000000000 199.997000000000
1.30000000000000 199.996000000000
1.40000000000000 199.995000000000
1.50000000000000 199.995000000000
1.60000000000000 199.994000000000
1.70000000000000 199.993000000000
1.80000000000000 199.992000000000
1.90000000000000 199.992000000000
2 199.991000000000
2.10000000000000 199.990000000000
2.20000000000000 199.989000000000
...
To change the varable names, you could do the following:
T5.Properties.VariableNames={'newVar1','newVar2'}
Oh you are right, its only rounding in the display, actual values have been correctly read. Did not know it showed rounded values in display, I am not an extensive MATLAB user.
T.Properties.VariableNames={'newVar1','newVar2'}
This also worked like a charm, thanks a lot Chris, learned some useful things today because of you.
Glad to help. One more follow up. It's only rounding with you display the entire table. Try displaying just var2 and it will print without rounding.
T.Var2
ans =
199.9980
199.9970
199.9970
199.9960
199.9950
199.9950
199.9940
199.9930
199.9920
199.9920
...
Oh Okay, Thankyou so much.

Sign in to comment.

More Answers (1)

In R2020a,
T = readtable('data.csv','TrimNonNumeric',true,'NumHeaderLines',11);
or in earlier releases,
opts = detectImportOptions('data.csv','TrimNonNumeric',true,'NumHeaderLines',11);
T = readtable('data.csv',opts);

5 Comments

I did not know about 'TrimNonNumeric'. Thanks for sharing!
Thankyou for your answer Jeremy, Chris's method worked fine, but to find out more techniques, I tried out your code as well. Since I have a 2019b release, I couldn't use the:
T = readtable('data.csv','TrimNonNumeric',true,'NumHeaderLines',11);
but was able to use this:
opts = detectImportOptions('data.csv','TrimNonNumeric',true,'NumHeaderLines',1);
T = readtable('data.csv',opts);
However there is a bit of a change that I had to do, I had to change the 'NumHeaderLines' argument to 1 instead of 11.
I can't seem to find exact definition but, I am guessing NumHeaderLines skips lines containing numeric values and HeaderLines skips lines in general.
Hmm, the NumHeaderLines thing seems like a bug. Reported.
Or maybe combined with TrimNonNumeric its acting that way.
Since we have already trimmed non numeric part we dont have to trim header again, we just have to trim numeric header. Thats why by using 1 as argument it works fine, because there only 1 numeric line in header part of my csv file.
Oh, I see, never mind. Not a bug exactly--just a quirk of CSV data. The first row begins with a double quote character.
CSV spec says that should be one field, containing the newlines and all... so there is only one line being interpreted by the parser. The first "row" considering quotes looks like this
"freq\nReferences : ['R']\nDependency : [freq]\nNum. Points : [91]\nMatrix Size : scalar\nType : Real", ...
"R\nReferences : []\nDependency : [freq]\nNum. Points : [91]\nMatrix Size : scalar\nType : Real"
If you leave off the NumHeaderLines it treats this as a row, and stripes away everything but the 91. and I see:
T = readtable('data.csv','TrimNonNumeric',true); head(T)
ans =
8×2 table
Var1 Var2
____ ______
91 91
1 200
1.1 200
1.2 200
1.3 200
1.4 200
1.5 200
1.6 199.99

Sign in to comment.

Categories

Products

Release

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!