How can I read from a file into a char array?
I have a large text file, and I need to calculate the number of times each individual letter occurs in the file. The easiest way I can think of to do that would be to have an array where each entry is a single char from the file, then run an array function on the whole thing and sum the number of times each letter is found. However I am having trouble getting the text from the file into a char array. I have tried using fileread, which reads the entire file to a single entry in a string array, and I have tried using textscan, which reads the file into a cell array split up by words. Does anyone know if I can just get the file straight into a char array?
2 Comments
John
on 28 Sep 2014
Edited: John
on 28 Sep 2014
When you use fileread to read the text in a file you actually get a char array.
Let's say testfile.txt contains the text:
this is a test file
If you use fileread like this:
fileContents = fileread('testfile.txt')
fileContents will be a char array with the individual characters. Check that that is so with:
class(fileContents) %Should echo 'char'
isvector(fileContents) %Checks if fileContents is a vector, should return 1/true
The overall problem seems like a college homework assignment :-) so I will refrain from providing a solution. There are a couple of ways to keep a count of each character in the char array. One way would be to iterate through the char array and keep the counts in a Map container, where the keys are the individual characters and the values are the number of times each unique character occurs in the char array.
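For readers who just want to see the shape of the Map idea, here is a minimal sketch, not a definitive solution. It assumes a short hard-coded string in place of the file contents; the variable names counts, ch and nT are made up for illustration.

```matlab
% Hypothetical sample text standing in for the result of fileread
fileContents = 'this is a test file';

% Map whose keys are single characters and whose values are running counts
counts = containers.Map('KeyType', 'char', 'ValueType', 'double');

for k = 1:numel(fileContents)
    ch = fileContents(k);            % one character at a time
    if isKey(counts, ch)
        counts(ch) = counts(ch) + 1; % seen before: increment
    else
        counts(ch) = 1;              % first occurrence
    end
end

% Look up the count for one character
nT = counts('t');
```

keys(counts) and values(counts) return the distinct characters and their counts as cell arrays, which may be handy for printing a summary.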
Accepted Answer
per isakson
on 28 Sep 2014
Edited: per isakson
on 29 Sep 2014
Try
str = fileread( filespec );
num = double( str );
nch = histc( num, [1:255] ); % fix [32:255]
A little test - added later
>> char( find( histc( double('abcd1234'), [1:255] ) ) )
ans =
1234abcd
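To get from nch back to per-letter counts, one option is to index it with the character codes of interest. This is only a sketch under the assumption that the bins are 1:255 as above, so that bin k holds the count for character code k; the sample string and the names letters and letterCounts are hypothetical.

```matlab
% Hypothetical sample text standing in for fileread( filespec )
str = 'this is a test file';
num = double( str );
nch = histc( num, 1:255 );        % nch(k) is the count of character code k

% Pull out the counts for the lowercase letters only
letters      = 'a':'z';
letterCounts = nch( double( letters ) );

% Count for a single letter, here 't'
nT = nch( double( 't' ) );
```

Applying lower to str first would fold uppercase and lowercase letters together, if that is what the count should reflect.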
More Answers (1)
Geoff Hayes
on 28 Sep 2014
io_contents = ...
fullfile(matlabroot,'toolbox','matlab','iofun','Contents.m');
filetext = fileread(io_contents);
Note that filetext is a 1x4244 array of char elements. So you can either loop over each element and update your "counting" array, or try something else. Remember that each character has an ASCII code, so we could use that to our advantage. If we convert the character array into a numeric array, we could then use a histogram function (for example histc) to determine the counts for each character:
charBinCounts = histc(double(uint8(filetext)),0:1:127);
So we take the 1x4244 character array filetext, convert it to 8-bit unsigned integers, and then convert that to double (I needed to do both conversions because of histc). We then pass this numeric array to the histc function with the bins given by 0,1,2,...,126,127, since standard ASCII codes run from 0 through 127. (Unsigned 8-bit integers themselves range from 0 through 255, so any character with a code above 127 falls outside these bins and is not counted.)
charBinCounts contains the counts for each character.
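Because the bins here start at 0 while MATLAB indexing starts at 1, the count for a given character sits at index (code + 1). A small sketch of that lookup, assuming a hard-coded sample string rather than Contents.m; countOfL and countOfO are hypothetical names.

```matlab
% Hypothetical sample text standing in for the file contents
filetext = 'hello world';
charBinCounts = histc(double(uint8(filetext)), 0:1:127);

% Bin 1 corresponds to code 0, so shift the character code by one
countOfL = charBinCounts(double('l') + 1);
countOfO = charBinCounts(double('o') + 1);
```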