how to handle 'nested' arrays
Hello, I am new to MATLAB and am having a problem with code like this:
x=["bar" "foobar" "food" "bard"];
y=regexp(x,'.*foo.*','match');
x is a string array, but the problem is that y seems to be a cell array of string arrays, which I can't manipulate or use at all. So what I want to do is "simplify" y so that it is also a simple string array, by taking the first element of each element of the cell array. That is, the output should look like:
["" "foobar" "food" ""]
Any general references on how to handle and manipulate nested arrays of arrays would also be appreciated.
Accepted Answer
Guillaume
on 9 Oct 2017
In this particular instance, the easiest way to solve your problem is to add the 'once' option to your regexp:
matches = regexp(x, '.*foo.*', 'match', 'once');
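If 'once' is not an option (say, because other code needs the full cell output), the flattening described in the question can also be done by hand. This is only a minimal sketch, assuming each cell of y holds either an empty string array (no match) or a string array whose first element is the wanted match; the variable name out is made up for illustration:

```matlab
x = ["bar" "foobar" "food" "bard"];
y = regexp(x, '.*foo.*', 'match');   % cell array, one cell per element of x

out = strings(size(y));              % preallocate; every element starts as ""
for k = 1:numel(y)
    if ~isempty(y{k})
        out(k) = y{k}(1);            % keep the first match found for element k
    end
end
```

Under that assumption, out has one entry per element of x, with "" left in place wherever nothing matched.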
6 Comments
Cedric
on 11 Oct 2017
Edited: Cedric
on 11 Oct 2017
MATLAB data structures and indexing are not completely trivial topics, especially in the beginning.
Indexing using () is called block indexing, and it returns a block of whatever is indexed. Block indexing a numeric array returns a numeric (sub-)array, and block indexing a cell array returns a (sub-)cell array. For cell arrays, there is also {} indexing, which accesses the content of cells. If C is a cell array, C(2,1) indexes the cell at position (2,1), and C{2,1} indexes its content:
>> C = {'John', 'Smith', 32; 'Dana', 'Doe', 57} ;
>> C(2,1)
ans =
1×1 cell array
{'Dana'}
>> class( C(2,1) )
ans =
'cell'
>> C{2,1}
ans =
'Dana'
>> class( C{3} )
ans =
'char'
Arrays with multiple dimensions can be indexed with subscripts (e.g. C{1,3}) or linearly, following the column-major order in which elements are stored in memory (e.g. C{5} for the same element), and with either numeric indices or logicals (flags, logical indexing):
>> A = randi( 10, 2, 3 )
A =
9 2 7
10 10 1
>> A > 5
ans =
2×3 logical array
1 0 1
1 1 0
>> A(A>5)
ans =
9
10
10
7
>> linpos = find(A>5) % Return linear index, not subscripts.
linpos =
1
2
4
5
>> A(linpos)
ans =
9
10
10
7
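The claim that C{1,3} and C{5} address the same element can be checked with sub2ind and ind2sub, which convert between subscripts and linear indices. A small sketch reusing the cell array C from above; the variable names are illustrative only:

```matlab
C = {'John', 'Smith', 32; 'Dana', 'Doe', 57};

idx = sub2ind(size(C), 1, 3);            % subscripts (1,3) -> linear index 5
sameElement = isequal(C{1,3}, C{idx});   % true, both refer to the value 32

[row, col] = ind2sub(size(C), 5);        % linear index 5 -> row 1, column 3
```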
Finally, accessing the content of multiple cells of a cell array returns a comma-separated list (CSL):
>> C{1:2, 1}
ans =
'John'
ans =
'Dana'
which is equivalent to evaluating
>> C{1,1}, C{2,1}
ans =
'John'
ans =
'Dana'
You probably recognize the comma-separated list as the way arguments are passed in function calls. This is in fact the primary use of CSLs: expanding the content of a cell array into a list of arguments for a function call. When I write:
horzcat( matches{:} )
it is the same as
horzcat( matches{1}, matches{2}, .. )
for concatenating contents, but I don't have to write the CSL explicitly, which would be a problem because the size of matches varies.
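A small sketch of that expansion, using made-up data in place of the real regexp output (the contents of matches here are an assumption, chosen so that one cell is empty):

```matlab
matches = {["a" "b"], strings(1, 0), "c"};  % hypothetical data; the middle cell is empty

viaList     = horzcat(matches{:});  % same as horzcat(matches{1}, matches{2}, matches{3})
viaBrackets = [matches{:}];         % square brackets concatenate the same list

% Empty cells contribute nothing: the result has 3 elements, not one per cell.
nOut = numel(viaList);
```

Note that the positions in the result no longer line up with the cells they came from, so this approach does not leave a "" placeholder where a cell was empty.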
This summary is a pretty exhaustive overview of the main indexing techniques in MATLAB. It's missing structs, struct arrays, and dynamic field names (i.e. S.('myField') interpreted as S.myField, with the advantage that 'myField' can be the content of a char variable). With this addition, you would have an exhaustive list of the classic data structures in MATLAB.
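For completeness, a short sketch of the dynamic field name syntax mentioned above. The struct S and the names fieldName and other are hypothetical, chosen only for illustration:

```matlab
S = struct('myField', 42);   % hypothetical struct with one field
fieldName = 'myField';       % the field name is held in a char variable

value = S.(fieldName);       % same as S.myField
S.(fieldName) = value + 1;   % dynamic names work for assignment too
S.('other') = 'text';        % assigning to a new name creates that field
```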
However, MathWorks has introduced newer data types as well: tables, strings, etc. String arrays arrived in R2016b, and the double-quote syntax for creating them in R2017a. The main difference from the simpler char arrays (the classic way of storing "strings") is that they are objects (char is a class too, but you cannot explicitly access its methods):
>> cs = 'hello'
cs =
'hello'
>> s = string(cs)
s =
"hello"
>> methods(s)
Methods for class string:
cellstr eq gt lower reverse upper
char erase insertAfter lt sort
compose eraseBetween insertBefore ne split
...
which means that they incorporate methods for working on them that can be quite powerful, and the consistency of OOP.
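A few of these differences in practice, as a rough sketch. It assumes R2017a or later because of the double-quoted literals, and the variable names are illustrative:

```matlab
cs = 'hello';          % char array: a 1x5 array of characters
s  = string(cs);       % string: a 1x1 array holding the whole text

nChar = numel(cs);     % 5, counts characters
nStr  = numel(s);      % 1, counts strings
len   = strlength(s);  % 5, number of characters in the string

shout  = upper(s);             % "HELLO"
parts  = split("a,b,c", ",");  % 3x1 string array
joined = s + " world";         % + appends text for strings
```

The point most likely to surprise newcomers is numel: a char array is an array of characters, whereas a string array is an array of whole pieces of text.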
Guillaume
on 11 Oct 2017
"which is best to use"
That's a very easy question to answer: if you're on a version that supports them (R2016b or later) and don't care about older versions, use string arrays. They're a lot more intuitive and powerful. Strings didn't exist before R2016b, so on earlier versions char arrays were the only option.
Think of a cell array as a container that can hold any kind of thing. The container has cells, and you put one thing in each cell. You can look at the container itself; for example, you can take the part of the container that spans rows 1:3 and columns 2:4. For that you use (), and you get another container:
container = cell(10, 20);
container_portion = container(1:3, 2:4); %container_portion is still a cell array
You can also look at what's inside the container. That's when you use {}: you get whatever is inside each cell.
container = cell(10, 20);
container{1, 5} %returns the content of cell(1,5)
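The same distinction applies when storing values. A small sketch with a smaller container than above; the sizes and values are arbitrary choices for illustration:

```matlab
container = cell(2, 3);          % every cell starts out holding []

container{1, 2} = 'abc';         % {} puts a value inside a cell
container(2, 3) = {pi};          % () expects a cell on the right-hand side

portion = container(1, 1:2);     % still a cell array, of size 1x2
content = container{1, 2};       % the char array 'abc'
```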
You can ask for the content of several cells at once with {}, and MATLAB returns a comma-separated list of the contents (see Cedric's link), so if you do
matches{:}
you get a comma-separated list of all the string arrays. You can pass that list to horzcat, which simply concatenates all its inputs horizontally, thus
horzcat(matches{:})
extracts the content of all the cells of the cell array and concatenates it into a single row vector.
More Answers (1)
jean claude
on 9 Oct 2017
Edited: jean claude
on 9 Oct 2017
Hi maho, I didn't understand exactly what you want, but you can read https://www.mathworks.com/help/matlab/ref/strsplit.html; maybe it helps!