Sort Excel files by file content?
Hi all, my new challenge with MATLAB involves filtering files with very inconsistent names. For example:
s =
'HI_B2_TTT9_D452_07052016.xlsx'
'HI_H2G_TTT7_D259_070516.xlsx'
'HI_B2C_TTT9_D1482_070516.xlsx'
'HI_A1C_468_070516_TTT4.xlsx'
'HI__TTT8_862_07052016_G1C.xlsx'
'HI_KA6_TTT4_148_07052016.xlsx'
'8C_HI_279_Potato_07052016.xlsx'
'HI_8C_279_Bacon_TTT52016.xlsx'
The files that I want are the first six, which follow different naming styles even though they are all the same type of file (TTT). The last two files are unwanted and need to be filtered out; they can be identified by keywords such as "Potato" and "Bacon".
My goal is to extract the files that contain the keyword "TTT" while eliminating the files that contain the keywords "Potato" and "Bacon". This is not ideal: beyond this simple example there are in fact hundreds of these files in my folders, they are constantly being updated, and I would need to look through them all for other potential unwanted keywords such as "Sour Cream", etc.
Ideally, I would extract the TTT files by their content, since all the TTT Excel files contain a sheet named "cooking is fun" while none of the others do. Is this feasible, and what is the best way to do it?
Thank you so much for reading my question. Any input will be greatly appreciated!
5 Comments
Azzi Abdelmalek
on 6 Jul 2016
What about this name: 'HI_8C_279_Bacon_TTT52016.xlsx'? It contains both TTT and Bacon!
chlor thanks
on 6 Jul 2016
Azzi Abdelmalek
on 6 Jul 2016
Look at these two examples:
'HI_B2C_TTT9_D1482_070516.xlsx'
'HI_8C_279_Bacon_TTT52016.xlsx'
What is your criterion to filter the first one?
chlor thanks
on 6 Jul 2016
Azzi Abdelmalek
on 6 Jul 2016
Look at the edited answer.
Accepted Answer
More Answers (1)
Azzi Abdelmalek
on 6 Jul 2016
Edited: Azzi Abdelmalek
on 6 Jul 2016
Edited
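% Keep only the names that contain 'TTT' (non-matching names become empty)
a = regexp(s, '.+TTT.+', 'match', 'once');
% Blank out any remaining name that contains 'Bacon'
b = regexprep(a, '\S+Bacon\S+', '');
% Select the original names whose entry in b is still non-empty
out = s(~cellfun(@isempty, b))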
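For anyone who wants to try the name-based approach on the sample list from the question, here is a minimal, self-contained sketch. It is a variation on the idea above rather than the exact code of the answer: it builds two logical masks instead of blanking strings, and the `unwanted` list is a hypothetical variable introduced only for illustration.

```matlab
% Sample file names copied from the question
s = {'HI_B2_TTT9_D452_07052016.xlsx'
     'HI_H2G_TTT7_D259_070516.xlsx'
     'HI_B2C_TTT9_D1482_070516.xlsx'
     'HI_A1C_468_070516_TTT4.xlsx'
     'HI__TTT8_862_07052016_G1C.xlsx'
     'HI_KA6_TTT4_148_07052016.xlsx'
     '8C_HI_279_Potato_07052016.xlsx'
     'HI_8C_279_Bacon_TTT52016.xlsx'};

% Hypothetical list of unwanted keywords (an assumption; extend as needed)
unwanted = {'Potato', 'Bacon'};

% True where the name contains 'TTT'
hasTTT = ~cellfun(@isempty, regexp(s, 'TTT', 'once'));

% True where the name contains any unwanted keyword ('Potato|Bacon')
hasUnwanted = ~cellfun(@isempty, regexp(s, strjoin(unwanted, '|'), 'once'));

% Keep names that contain TTT and none of the unwanted keywords
out = s(hasTTT & ~hasUnwanted);
```

With the eight sample names, `out` holds the first six. Adding a keyword such as 'Sour Cream' only requires extending `unwanted`, but the list still has to be maintained by hand, which is the drawback described in the question.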
3 Comments
chlor thanks
on 6 Jul 2016
Guillaume
on 6 Jul 2016
Certainly not! More likely, you'd get a syntax error. I'd recommend you learn the regular expression language (note that it is not specific to MATLAB).
If you want to replace Bacon or Potato, this regex would work:
regexprep(a, '\S+(?:Bacon|Potato)\S+', '')
However, this has nothing to do with the original question: "Sort Excel files by file content?"
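On the original question of selecting by content: one possible approach is to ask each workbook for its sheet names and keep the files that have a sheet called "cooking is fun". The sketch below uses `xlsfinfo`, which returns the sheet names as its second output. The variable names and the choice of `pwd` as the folder are assumptions for illustration, and the code has not been run against the poster's actual files.

```matlab
% Assumption: the workbooks live in the current folder; change as needed
folder = pwd;
wantedSheet = 'cooking is fun';   % sheet name given in the question

% List every .xlsx file in the folder
files = dir(fullfile(folder, '*.xlsx'));
names = {files.name};

keep = false(size(names));
for k = 1:numel(names)
    % status is empty when the file cannot be read as a spreadsheet
    [status, sheets] = xlsfinfo(fullfile(folder, names{k}));
    if ~isempty(status)
        % Keep the file if any of its sheets has the wanted name
        keep(k) = any(strcmp(sheets, wantedSheet));
    end
end

% Names of the files that contain the wanted sheet
tttFiles = names(keep);
```

This avoids maintaining a list of unwanted keywords, at the cost of opening every file, so it may be slow with hundreds of workbooks. In R2019b and later, `sheetnames` is a newer function for listing the sheets of a spreadsheet file.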
chlor thanks
on 6 Jul 2016