Create empty rows in a cell array, based on a condition

Hello everyone,
I have a cell array (390x1) that should repeat the words 'hello', 'hey', 'hi' and 'hoi' every 4 rows, as in the following:
Row 1 - 'hello'; Row 2 - 'hey'; Row 3 - 'hi'; Row 4 - 'hoi'; Row 5 - 'hello'; Row 6 - 'hey'; Row 7 - 'hi'; Row 8 - 'hoi'.
However, sometimes it goes like:
Row 9 - 'hello'
Row 10 - 'hey'
Row 11 - 'hoi'
And it skips the 'hi' row, making it so that I can't loop it using something like "for i = 1:4:390", which is what I need.
What I want is a cell array with the same content in which every group of 4 rows is complete. That is: when the 3rd row has "hi" in it, those 4 rows stay the same. But when the "hi" row is missing, an empty row is inserted between the 2nd and the 4th row (between the 'hey' and the 'hoi').
For example, it would become:
Row 9 - 'hello'
Row 10 - 'hey'
Row 11 - '' (empty)
Row 12 - 'hoi'
I already tried doing this with loops and the strcmp function, but I really can't do it.
Please ask questions in case I didn't explain myself well enough.
Thank you

 Accepted Answer

Here's an efficient way, with a loop (arrayfun) only over the missing elements, not the whole cell array. In addition, if several contiguous entries are missing, it inserts the required number of empty cells.
fillvalue = ''; %whatever you want to be put in the missing entries
valueset = {'hello', 'hey', 'hi', 'hoi'}; %the repeating set. Must be in the correct order!
%dataset: the cell array containing repetitions of valueset in the same order, but with some values missing
%e.g.
dataset = valueset(repmat(1:numel(valueset), 1, 5))'; dataset([8, 9, 12, 19]) = []; %demo dataset with 4 missing entries
%The following requires dataset to be a COLUMN cell vector.
[found, id] = ismember(dataset, valueset); %replace char vectors by numeric id (corresponding to their order in value set)
assert(all(found), 'dataset contains value not present in value set. Cannot continue');
deltaid = mod(diff(id), numel(valueset)); %complete difference between ids. If no entry is missing it should be all 1s.
%Non-1 entries in deltaid mark where elements are missing. The value of deltaid minus 1 indicates how many entries are missing
ismissing = deltaid ~= 1;
splitdataset = mat2cell(dataset, diff([0; find(ismissing); numel(dataset)])); %split cell array at locations where entries are missing. mat2cell despite its name also works with cell arrays
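splitdataset = [splitdataset, arrayfun(@(nummissing) repmat({fillvalue}, nummissing, 1), [mod(deltaid(ismissing)-1, numel(valueset)); 0], 'UniformOutput', false)]'; %append to each segment the appropriate number of fill values, then transpose. The mod covers a repeated id (deltaid of 0), which means numel(valueset)-1 missing entries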
newdataset = vertcat(splitdataset{:})
%entries 8, 9, 12, 19 will have had fillvalues inserted
Note that this code detects any missing element in the sequence, not just the 'hi'. It works for any length of input, for whatever text is in the set, and for any number of gaps, as long as no single gap spans a complete repetition of the set (a whole missing cycle leaves no trace in the ids).
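The diff/mod step is the core of the code above, so it may help to check it on a tiny case. The following is a minimal sketch on made-up data (a 7-row dataset whose first 'hi' is missing); the names gapafter and nummissing are only used here for illustration.

```matlab
% Tiny worked example of the diff/mod step (illustrative data only).
valueset = {'hello', 'hey', 'hi', 'hoi'};
dataset  = {'hello'; 'hey'; 'hoi'; 'hello'; 'hey'; 'hi'; 'hoi'};  % first 'hi' is missing

[~, id] = ismember(dataset, valueset);      % id = [1; 2; 4; 1; 2; 3; 4]
deltaid = mod(diff(id), numel(valueset));   % raw diffs are [1; 2; -3; 1; 1; 1]
% mod wraps the -3 at the 'hoi' -> 'hello' boundary back to 1,
% so deltaid = [1; 2; 1; 1; 1; 1]
gapafter   = find(deltaid ~= 1);            % 2: a gap follows row 2
nummissing = deltaid(gapafter) - 1;         % 1: one entry is missing there
```

Every 1 in deltaid is a normal step to the next word, including the wrap from 'hoi' back to 'hello'. Only the 2 after row 2 stands out, and it says that one fill value belongs between 'hey' and 'hoi'.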

1 Comment

This did exactly what I needed it to do! Thank you so much.


More Answers (1)

Hello
I do not know why there should be rows that are skipped, unless you are copying and pasting and something goes wrong. I have been able to do what you wanted like this:
>> for k=1:4:390
a{k,1}='hello';
a{k+1,1}='hey';
a{k+2,1}='hi';
a{k+3,1}='hoi';
end
Notice that to get a column (one value per row) you have to specify the column subscript, {k,1}; with a single subscript, a new cell array grows as a row instead. Now if you want to have a blank entry you can do it like this:
>> for k=1:4:390
b{k,1}='hello';
b{k+1,1}='hey';
b{k+2,1}='';
b{k+3,1}='hoi';
end
I hope that this answers your question, if it does not, let me know. If it does, please accept the answer.
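The remark above about specifying the column subscript can be checked with a short sketch. The variable names are arbitrary, and both cell arrays start out empty.

```matlab
a = {};  a{3}    = 'x';   % single (linear) subscript on an empty cell array
b = {};  b{3, 1} = 'x';   % explicit row and column subscripts

size(a)   % 1 3  (grew as a row)
size(b)   % 3 1  (grew as a column)
```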

4 Comments

Hello,
That doesn't really solve my problem, since the biggest issue is that I can't loop using "k = 1:4:390": sometimes the 3rd row ('hi') that I talked about is not even there.
The problem is that my 'hello/hey/hi/hoi' messages are followed by numbers which I didn't mention because I didn't think it was needed.
So basically I have (I wrote completely random numbers since there's no pattern in my data numbers):
  • Hello 3284832
  • Hey 943583
  • Hi 6739429
  • Hoi 4385834
  • Hello 1238124
  • Hey 842528
  • Hoi 742529
And what I want is to create a cell that is equal to that one but when there is a "hi" row, that row isn't changed at all, it keeps the 'hi' and the number. But when there's a "hi" row missing between the "hey" and the "hoi" row, it creates an empty row.
Hopefully it's understandable enough.
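For rows like the ones listed above, one option is to compare only the leading word of each row. The following is a minimal sketch, assuming each row is a single char vector of the form 'Label number' (the post does not say how the numbers are actually stored) and that labels may differ in case from the reference set. The names rows and labels are hypothetical.

```matlab
% Assumed layout: one char vector per row, 'Label number' (values taken from the post).
rows = {'Hello 3284832'; 'Hey 943583'; 'Hi 6739429'; 'Hoi 4385834'; ...
        'Hello 1238124'; 'Hey 842528'; 'Hoi 742529'};
valueset = {'hello', 'hey', 'hi', 'hoi'};

labels = lower(strtok(rows));               % first whitespace-delimited token, lower-cased
[found, id] = ismember(labels, valueset);   % exact match on the whole label
assert(all(found), 'A row starts with a label that is not in valueset.');
% id = [1; 2; 3; 4; 1; 2; 4]: the jump from 2 to 4 marks the missing 'hi' row
```

The ids can then drive the gap detection, while the fill values are inserted into rows itself, so the rows that are present keep their numbers.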
Ok, so you already have the cell, and in some cases the info is, let's say "corrupted" and what you want is to "clean" it by introducing a position corresponding to the third in a group of four. So, going back to my example
>> for k=1:4:390
a{k,1}='hello';
a{k+1,1}='hey';
a{k+2,1}='hi';
a{k+3,1}='hoi';
end
That would be the correct set, and this will skip one "hi" in position 23
>> for k=1:22
b{k,1}=a{k,1};
end
>> for k=23:390
b{k,1}=a{k+1,1};
end
So to solve the problem you need to find if 4 consecutive locations have the correct sequence, you can test it like this
>> k = 1;
>> strfind(b{k},'hello')&strfind(b{k+1},'hey')&strfind(b{k+2},'hi')&(strfind(b{k+3},'hoi'))
ans =
logical
1
>>
Now you have to loop again in groups of 4
for k=1:4:390
if strfind(b{k},'hello')&strfind(b{k+1},'hey')&strfind(b{k+2},'hi')&(strfind(b{k+3},'hoi'))
% It complies, copy to a new cell, say d
d{k}=b{k};d{k+1}=b{k+1};d{k+2}=b{k+2};d{k+3}=b{k+3};
else
% it does not comply, the hi is missing
d{k}=b{k};d{k+1}=b{k+1};
d{k+3}=b{k+2};
%introduce the hi
d{k+2}='hi';
end
end
So the first branch of the if handles the case where all four entries are present and just copies them; the second branch introduces the missing one. The only thing to take into account is that if the hi is missing you might have to alter the k so that it goes one before.
That should work, hopefully!
I would strongly recommend using strcmp instead of strfind. The above will completely fail if for example the sequence was {'ho', 'hoi'} instead of {'hi', 'hoi'}.
I would also recommend:
  • to preallocate d instead of growing (slowly) in the loop
  • using numel(b) for the end bound instead of a hardcoded bound which would need changing if the length of b changes
  • using index ranges for copying instead of individual copies (i.e. d(k:k+3) = b(k:k+3))
The above will also fail if more than 4 'hi' are missing.
edit: actually, the above will fail after the first missing 'hi' since from then on, the 'hello' is no longer on a multiple-of-4 index.
I highly appreciate both of your inputs!
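The recommendations in this thread (strcmp, preallocation, numel for the bound) can be combined with separate read and write positions, so that a missing row no longer shifts the alignment. This is a rough sketch on hypothetical demo data, not the code from either answer; the names w and slot are made up for the illustration.

```matlab
% Hypothetical demo data: the second group lacks its 'hi' row.
b = {'hello'; 'hey'; 'hi'; 'hoi'; 'hello'; 'hey'; 'hoi'};
valueset = {'hello', 'hey', 'hi', 'hoi'};
n = numel(valueset);
assert(all(ismember(b, valueset)), 'b contains a value not present in valueset.');

d = repmat({''}, n * numel(b), 1);   % preallocate with fill values; n*numel(b) is a safe upper bound
w = 0;                               % number of rows written to d so far
for r = 1:numel(b)
    slot = mod(w, n) + 1;            % position in the cycle of the next row to write
    while ~strcmp(b{r}, valueset{slot})
        w = w + 1;                   % skip this slot, leaving its fill value in place
        slot = mod(w, n) + 1;
    end
    w = w + 1;
    d{w} = b{r};                     % exact match found: copy the row
end
d = d(1:w);                          % trim the unused preallocated rows
```

Because the write position w advances independently of the read position r, a fixed-stride loop such as for k = 1:4:numel(d) lines up with 'hello' again afterwards. A trailing incomplete group is not padded in this sketch.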
