Splitting text between characters
Hello everyone, I have a very long list of questions with their answers, all following the same pattern. The text comes from OCR of a photo. The pattern is: question number, question, three dots, and the answer. I have hundreds of questions and answers like that. I want to separate them into question and answer columns for a CSV file. Is it possible to do this?
I used extractBetween for the question:
extractBetween(text,'.','...')
"640. En selektif etkili antibiyotik...Penisilinler"
But with a very long list of text, I am stuck on extracting both the question and the answer.
Can someone help me, or tell me whether this is possible?
Answers (2)
Mathieu NOE
on 1 Feb 2021
Hello Ongun,
I created a small txt file containing these lines (as an example):
"640. En selektif etkili antibiyotik...Penisilinler"
"641. En selektif etkili antibiyotik...Penisilinler1"
"642. En selektif etkili antibiyotik...Penisilinler2"
"643. En selektif etkili antibiyotik...Penisilinler3"
"644. En selektif etkili antibiyotik...Penisilinler4"
Then I tested this code, which generated the attached xlsx file:
lines = readlines('Document1.txt');
lines = lines(strlength(lines) > 0); % drop empty lines (e.g. from a trailing newline)
n = numel(lines);
Qnumber = zeros(n,1); % preallocate the outputs
Question = strings(n,1);
Answer = strings(n,1);
for ci = 1:n
    txt = strrep(lines(ci), '"', ''); % remove the start and end double quotes
    Qnumber(ci) = str2double(extractBefore(txt,'.')); % extract question number and convert to num
    Question(ci) = strtrim(extractBetween(txt,'.','...')); % extract question
    Answer(ci) = strtrim(extractAfter(txt,'...')); % extract answer
end
T = table(Qnumber, Question, Answer, 'VariableNames',{'Q number','Question','Answer'})
writetable(T,'test.xlsx')
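The same extraction can probably be done without a loop, since extractBefore, extractBetween and extractAfter accept string arrays. The following is only a sketch under a few assumptions: the sample lines are held in memory rather than read from Document1.txt, the surrounding double quotes are already gone, and every line contains exactly one question number and one '...' separator.

```matlab
% Sample lines held in memory (assumption: quotes already removed)
lines = ["640. En selektif etkili antibiyotik...Penisilinler"
         "641. En selektif etkili antibiyotik...Penisilinler1"];

% Each call below processes every element of the string array at once
Qnumber  = str2double(extractBefore(lines, "."));       % text before the first '.'
Question = strtrim(extractBetween(lines, ".", "..."));  % text between '.' and '...'
Answer   = strtrim(extractAfter(lines, "..."));         % text after the first '...'

T = table(Qnumber, Question, Answer);
```

T can then be passed to writetable as in the code above. If some OCR lines are missing the '...' separator, this sketch would need an extra check before building the table.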
Walter Roberson
on 1 Feb 2021
S = fileread('Document1.txt');
tokens = regexp(S, '^(?<Q>.+)\.{3}(?<A>.+)$', 'lineanchors', 'names', 'dotexceptnewline');
Questions = string({tokens.Q}'); % one row per match; vertcat of char vectors fails when their lengths differ
Answers = string({tokens.A}');
T = table(Questions, Answers);
writetable(T, 'QandA.csv')
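To see what the named tokens look like, here is a small sketch that applies a similar pattern to two sample lines held in memory. The extra N token for the question number, the lazy `.+?`, and the variable names are my own assumptions for illustration and are not part of the answer above.

```matlab
% Two sample lines joined by a newline, standing in for the fileread output
S = sprintf('%s\n%s', ...
    '640. En selektif etkili antibiyotik...Penisilinler', ...
    '641. En selektif etkili antibiyotik...Penisilinler1');

% Named tokens: N = question number, Q = question text, A = answer
expr = '^(?<N>\d+)\.\s*(?<Q>.+?)\.{3}(?<A>.+)$';
tokens = regexp(S, expr, 'names', 'lineanchors', 'dotexceptnewline');

% tokens is a struct array with one element per matched line
Number    = str2double({tokens.N}');
Questions = string({tokens.Q}');
Answers   = string({tokens.A}');
T = table(Number, Questions, Answers);
```

With the lazy `.+?`, the split happens at the first run of three dots on each line, so an answer that itself contains '...' stays in one piece. Lines that do not match the pattern are skipped silently, which may or may not be what you want for noisy OCR text.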