How could I match data from two files?

I have two .txt files. The first column of FILE 1 contains some of the strings/names that also exist in the first column of FILE 2. In FILE 2, each name has a corresponding number in the second column. For each element of the first column of FILE 1, I want to find the matching number in column 2 of FILE 2.
[For example, I want the first cell of FILE1 ("B") to correspond to the number 56 from FILE2, etc. I am uploading these files.]
Could you please help me?
I am also uploading the final file I want to create (FINAL.txt).

1 Comment

I would like my code to:
1) Read file1 and file2.
2) Based on the matches in file 2 (e.g. the number 56 corresponds to "B", etc.), create a new column in file1.
3) Mainly, depending on the entry in the 1st column of file1, write the number it corresponds to (based on the file 2 matches) in another column.


 Accepted Answer

T1 = readtable('FILE1.txt', 'ReadVariableNames',false)
T1 = 3x1 table
    Var1
    _____
    {'B'}
    {'D'}
    {'A'}
T2 = readtable('FILE2.txt', 'ReadVariableNames',false)
T2 = 7x2 table
    Var1     Var2
    _____    ____
    {'A'}      1
    {'B'}     56
    {'C'}      3
    {'D'}     89
    {'E'}      5
    {'F'}     74
    {'G'}     66
Tout = join(T1,T2)
Tout = 3x2 table
    Var1     Var2
    _____    ____
    {'B'}     56
    {'D'}     89
    {'A'}      1
writetable(Tout, 'OutFile.txt', 'WriteVariableNames',false)
Or alternatively:
[X,Y] = ismember(T1.Var1,T2.Var1);
Tout = T2(Y,:)
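The ismember route can be tried without any files. The following is a minimal sketch that rebuilds the two example tables in memory (the values are copied from the output above) and uses the second output of ismember as a row index into T2. It assumes every key of T1 occurs in T2; otherwise Y contains zeros and the indexing fails.

```matlab
% In-memory stand-ins for FILE1.txt and FILE2.txt (values from the example above)
T1 = table({'B';'D';'A'}, 'VariableNames',{'Var1'});
T2 = table({'A';'B';'C';'D';'E';'F';'G'}, [1;56;3;89;5;74;66], ...
    'VariableNames',{'Var1','Var2'});

% X(k) is true if T1.Var1{k} occurs in T2.Var1,
% Y(k) is the matching row of T2 (0 if there is no match)
[X,Y] = ismember(T1.Var1, T2.Var1);

% Row lookup: only valid when all(X) is true
Tout = T2(Y,:);
```

Here Tout holds the rows of T2 in the order given by T1, which is the same result that join returns for these inputs.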

6 Comments

First of all, thank you. But when I use your script with other files, the command window shows:
Error using .... (line 7)
Index in position 1 is invalid. Array indices must be positive integers or logical values.
I am uploading the files.
Could you please help me?
This error occurs because not all of the strings in FILE1.txt occur in FILE2.txt:
T1 = readtable('FILE1.txt', 'ReadVariableNames',false);
T2 = readtable('FILE2.txt', 'ReadVariableNames',false);
[X,Y] = ismember(T1.Var1,T2.Var1);
find(~X)
ans = 2×1
     9
    46
T1(~X,:) % these strings are not in FILE2.txt
ans = 2x1 table
      Var1
    ________
    {'JAN3'}
    {'JAN3'}
Because you did not define how to handle missing strings, I did not include any such handling into my answer. Do you want to ignore them, or replace them with some default value, or some other action?
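Both of the first two options can be built from the ismember outputs. The following is a minimal sketch under assumptions: the small tables and the unmatched key 'Z' are made up for illustration, and NaN as the default value is only one possible choice.

```matlab
% Made-up example data: 'Z' does not occur in T2
T1 = table({'B';'Z';'A'}, 'VariableNames',{'Var1'});
T2 = table({'A';'B';'C'}, [1;56;3], 'VariableNames',{'Var1','Var2'});

[X,Y] = ismember(T1.Var1, T2.Var1);

% Option 1: ignore the unmatched rows of T1
ToutIgnore = [T1(X,:), T2(Y(X),2)];

% Option 2: keep every row of T1 and use NaN where there is no match
vals = nan(height(T1),1);
vals(X) = T2.Var2(Y(X));
ToutFill = T1;
ToutFill.Var2 = vals;
```

Indexing with Y(X) rather than Y skips the zeros that caused the "Index in position 1 is invalid" error.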
PS: it looks like you have used Excel to manipulate those data, and Excel has automatically converted "JAN3" (in FILE1.txt)
..
HER3
ITC1
JAN3 % looks okay!
KAC1
KAL3
..
into "3-Jan" (in FILE2.txt):
..
GRE3 170.22
HER2 546.6
HER3 545.34
ITC1 32.75
3-Jan 110.36 % Ooops, what happened here?
KAC1 101.17
KAL3 223.89
KAR2 137.4
..
What your files show exactly matches what is described here:
I spent many years at a large multinational fighting Excel on exactly this issue: how it automagically imports, identifies, and converts data based on locale date formats and other number formats. Unfortunately there is no way to turn off this "feature", as many engineers and scientists have discovered.
Avoid any strings or short sequences of numbers and delimiters that could possibly be identified as date/number. Ensuring reliable importing of short strings over multiple locales is very very challenging (read: not really feasible).
Summary: if you want reliable data processing which does not alter your data without warning, avoid Excel.
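One way to spot keys that may have been turned into dates is to test them against a day-month pattern before matching. This is a rough sketch, assuming the converted values look like '3-Jan' as in the file above; the sample list and the pattern are illustrative only and will not catch every locale-specific format.

```matlab
% Sample keys, including one value that looks like an Excel date conversion
keys = {'HER3';'ITC1';'3-Jan';'KAC1'};

% One or two digits, a hyphen, then a three-letter month abbreviation
pattern = '^\d{1,2}-[A-Za-z]{3}$';
isDateLike = ~cellfun(@isempty, regexp(keys, pattern, 'once'));

% Keys that deserve a manual check against the original data
suspect = keys(isDateLike);
```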
OK, I fixed it. I want to ask you, is there a way to add not only the first column of file 2 , but also the rest of columns? File 1 is a 74x3 table and file 2 is an 80x24 table. The final file is 74x24 (it has the 1st column of file 1 and all the columns of file 2). I mean I want to add all the three columns of file 1 (not only the first column) to my final file.
"..is there a way to add not only the first column of file 2 , but also the rest of columns?"
The code I gave in my answer already saves all of the columns of FILE2.txt.
"I mean I want to add all the three columns of file 1 (not only the first column) to my final file.":
It is not clear to me how you will "add" 3 columns to 24 columns to get 24 columns.
If you want to compare 3 columns of the tables, use indexing as required to select the required inputs to ismember (which operates by row for table inputs).
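As an illustration of the row-wise behaviour, here is a small sketch with made-up tables. The variable names Name and Num are hypothetical; for table inputs, ismember expects both tables to have the same variable names.

```matlab
% Made-up tables with identical variable names
A = table({'B';'D'}, [1;2], 'VariableNames',{'Name','Num'});
B = table({'A';'B';'D'}, [9;1;5], 'VariableNames',{'Name','Num'});

% A row of A matches only if all of its variables match a row of B
[tf, loc] = ismember(A, B);

% Restrict the comparison to selected columns by indexing both tables
[tfName, locName] = ismember(A(:,'Name'), B(:,'Name'));
```

In this example the row {'D',2} has no full-row match in B, but it does match when only the Name column is compared.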
Ivan Mich on 8 Mar 2021
Edited: Ivan Mich on 8 Mar 2021
I am uploading the two input files (file1.txt and file2.txt) and the final file (final_new.txt) I would like to create. As you can see, the first three columns of the final file are the three columns of file1.txt.
Could you please help me?
Do you want to compare only the first column (as your question stated), or compare the first three columns?
This solution compares the first column only:
T1 = readtable('FILE1.txt', 'ReadVariableNames',false);
T1.Properties.VariableNames(2:3) = {'Num1','Num2'}
T1 = 74x3 table
    Var1          Num1      Num2
    __________    ______    _______
    {''AIG2''}         2     1.1453
    {''ARG2''}    4.4286     4.6015
    {''ARS1''}    2.6667     8.4634
    {''CH01''}         3     9.8155
    {''CH02''}         3     9.5476
    {''HER2''}         2    0.39297
    {''HER3''}         2     2.8911
    {''ITC1''}         5     7.8801
    {''JAN3''}    3.8571      2.499
    {''KAC1''}    4.2857     4.2561
    {''KAL3''}     3.125     3.8118
    {''KAR2''}    3.6667     2.0716
    {''KAS2''}         3     2.6124
    {''KIF1''}    2.5263     4.8072
    {''KLR1''}         3     7.1157
    {''KOR2''}       2.8     2.6545
Note that your new FILE2.txt contains numeric data stored as char (also common with Excel):
T2 = readtable('FILE2.txt', 'ReadVariableNames',false)
T2 = 79x24 table
    Var1          Var2              Var3              Var4            Var5         Var6      Var7         Var8      Var9            Var10     Var11     Var12     Var13        Var14     Var15           Var16     Var17     Var18     Var19        Var20     Var21            Var22     Var23     Var24
    __________    ______________    ______________    ____________    _________    ______    _________    ______    ____________    ______    ______    ______    _________    ______    ____________    ______    ______    ______    _________    ______    _____________    ______    ______    ______
    {''AGN1''}    {''35.18790''}    {''25.71550''}    {''596.71''}    {''1.0''}    {''''}    {''HNN''}    {''''}    {''0.0155''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0139''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0125'' }    {''''}    {''''}    {''''}
    {''AIG2''}    {''38.24180''}    {''22.07240''}    {''137.06''}    {''3.0''}    {''''}    {''HNZ''}    {''''}    {''0.2445''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.4431''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.4519'' }    {''''}    {''''}    {''''}
    {''ALX2''}    {''40.84560''}    {''25.87380''}    {''511.65''}    {''1.0''}    {''''}    {''HNZ''}    {''''}    {''0.0178''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0142''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.0116'' }    {''''}    {''''}    {''''}
    {''ARE2''}    {''36.66640''}    {''22.38330''}    {''270.81''}    {''2.1''}    {''''}    {''HNN''}    {''''}    {''0.0646''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0600''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0877'' }    {''''}    {''''}    {''''}
    {''ARG2''}    {''38.17840''}    {''20.48780''}    {''49.53'' }    {''5.6''}    {''''}    {''HNN''}    {''''}    {''8.7649''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''5.6940''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''10.1653''}    {''''}    {''''}    {''''}
    {''ARS1''}    {''37.63520''}    {''22.72890''}    {''218.71''}    {''3.1''}    {''''}    {''HNN''}    {''''}    {''0.4260''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.1866''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.2739'' }    {''''}    {''''}    {''''}
    {''CH01''}    {''35.51690''}    {''24.02060''}    {''462.51''}    {''1.4''}    {''''}    {''HNN''}    {''''}    {''0.0334''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0391''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0239'' }    {''''}    {''''}    {''''}
    {''CH02''}    {''35.51440''}    {''24.03150''}    {''463.36''}    {''1.3''}    {''''}    {''HNN''}    {''''}    {''0.0371''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0222''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0373'' }    {''''}    {''''}    {''''}
    {''FLO2''}    {''40.78010''}    {''21.40510''}    {''240.95''}    {''2.3''}    {''''}    {''HNZ''}    {''''}    {''0.0443''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0971''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.1230'' }    {''''}    {''''}    {''''}
    {''FRS1''}    {''39.29350''}    {''22.38440''}    {''169.23''}    {''2.4''}    {''''}    {''HNN''}    {''''}    {''0.1736''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.1396''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.1039'' }    {''''}    {''''}    {''''}
    {''GRE3''}    {''40.08480''}    {''21.43830''}    {''170.22''}    {''2.5''}    {''''}    {''HNN''}    {''''}    {''0.1892''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.1138''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.1979'' }    {''''}    {''''}    {''''}
    {''HER2''}    {''35.33790''}    {''25.13550''}    {''546.60''}    {''1.3''}    {''''}    {''HNN''}    {''''}    {''0.0344''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0225''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0381'' }    {''''}    {''''}    {''''}
    {''HER3''}    {''35.32960''}    {''25.10650''}    {''545.34''}    {''1.3''}    {''''}    {''HNN''}    {''''}    {''0.0216''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0525''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0381'' }    {''''}    {''''}    {''''}
    {''ITC1''}    {''38.36460''}    {''20.71560''}    {''32.75'' }    {''5.7''}    {''''}    {''HNZ''}    {''''}    {''4.7129''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''7.8775''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''11.9261''}    {''''}    {''''}    {''''}
    {''JAN3''}    {''39.68380''}    {''20.83770''}    {''110.36''}    {''3.1''}    {''''}    {''HNZ''}    {''''}    {''0.2608''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.6242''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.4724'' }    {''''}    {''''}    {''''}
    {''KAC1''}    {''38.13810''}    {''21.54810''}    {''101.17''}    {''3.9''}    {''''}    {''HNZ''}    {''''}    {''0.7523''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''1.5065''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''2.1294'' }    {''''}    {''''}    {''''}
Tout = join(T1,T2)
Tout = 74x26 table
    Var1          Num1      Num2       Var2              Var3              Var4            Var5         Var6      Var7         Var8      Var9            Var10     Var11     Var12     Var13        Var14     Var15           Var16     Var17     Var18     Var19        Var20     Var21            Var22     Var23     Var24
    __________    ______    _______    ______________    ______________    ____________    _________    ______    _________    ______    ____________    ______    ______    ______    _________    ______    ____________    ______    ______    ______    _________    ______    _____________    ______    ______    ______
    {''AIG2''}         2     1.1453    {''38.24180''}    {''22.07240''}    {''137.06''}    {''3.0''}    {''''}    {''HNZ''}    {''''}    {''0.2445''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.4431''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.4519'' }    {''''}    {''''}    {''''}
    {''ARG2''}    4.4286     4.6015    {''38.17840''}    {''20.48780''}    {''49.53'' }    {''5.6''}    {''''}    {''HNN''}    {''''}    {''8.7649''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''5.6940''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''10.1653''}    {''''}    {''''}    {''''}
    {''ARS1''}    2.6667     8.4634    {''37.63520''}    {''22.72890''}    {''218.71''}    {''3.1''}    {''''}    {''HNN''}    {''''}    {''0.4260''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.1866''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.2739'' }    {''''}    {''''}    {''''}
    {''CH01''}         3     9.8155    {''35.51690''}    {''24.02060''}    {''462.51''}    {''1.4''}    {''''}    {''HNN''}    {''''}    {''0.0334''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0391''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0239'' }    {''''}    {''''}    {''''}
    {''CH02''}         3     9.5476    {''35.51440''}    {''24.03150''}    {''463.36''}    {''1.3''}    {''''}    {''HNN''}    {''''}    {''0.0371''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0222''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0373'' }    {''''}    {''''}    {''''}
    {''HER2''}         2    0.39297    {''35.33790''}    {''25.13550''}    {''546.60''}    {''1.3''}    {''''}    {''HNN''}    {''''}    {''0.0344''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0225''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0381'' }    {''''}    {''''}    {''''}
    {''HER3''}         2     2.8911    {''35.32960''}    {''25.10650''}    {''545.34''}    {''1.3''}    {''''}    {''HNN''}    {''''}    {''0.0216''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0525''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0381'' }    {''''}    {''''}    {''''}
    {''ITC1''}         5     7.8801    {''38.36460''}    {''20.71560''}    {''32.75'' }    {''5.7''}    {''''}    {''HNZ''}    {''''}    {''4.7129''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''7.8775''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''11.9261''}    {''''}    {''''}    {''''}
    {''JAN3''}    3.8571      2.499    {''39.68380''}    {''20.83770''}    {''110.36''}    {''3.1''}    {''''}    {''HNZ''}    {''''}    {''0.2608''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.6242''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.4724'' }    {''''}    {''''}    {''''}
    {''KAC1''}    4.2857     4.2561    {''38.13810''}    {''21.54810''}    {''101.17''}    {''3.9''}    {''''}    {''HNZ''}    {''''}    {''0.7523''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''1.5065''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''2.1294'' }    {''''}    {''''}    {''''}
    {''KAL3''}     3.125     3.8118    {''37.02460''}    {''22.10300''}    {''223.89''}    {''3.3''}    {''''}    {''HNN''}    {''''}    {''0.4451''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.5041''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.2471'' }    {''''}    {''''}    {''''}
    {''KAR2''}    3.6667     2.0716    {''39.36620''}    {''21.91950''}    {''137.40''}    {''3.7''}    {''''}    {''HNN''}    {''''}    {''1.2908''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.6620''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.3913'' }    {''''}    {''''}    {''''}
    {''KAS2''}         3     2.6124    {''40.50500''}    {''21.28030''}    {''208.57''}    {''1.7''}    {''''}    {''HNE''}    {''''}    {''0.0632''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0476''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.0687'' }    {''''}    {''''}    {''''}
    {''KIF1''}    2.5263     4.8072    {''38.07730''}    {''23.81460''}    {''288.52''}    {''2.1''}    {''''}    {''HNE''}    {''''}    {''0.0893''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0587''}    {''''}    {''''}    {''''}    {''HNN''}    {''''}    {''0.0907'' }    {''''}    {''''}    {''''}
    {''KLR1''}         3     7.1157    {''40.58240''}    {''22.94950''}    {''291.61''}    {''1.8''}    {''''}    {''HNN''}    {''''}    {''0.0656''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.0420''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.0275'' }    {''''}    {''''}    {''''}
    {''KOR2''}       2.8     2.6545    {''37.94010''}    {''22.94950''}    {''220.83''}    {''2.8''}    {''''}    {''HNN''}    {''''}    {''0.2458''}    {''''}    {''''}    {''''}    {''HNE''}    {''''}    {''0.2490''}    {''''}    {''''}    {''''}    {''HNZ''}    {''''}    {''0.1630'' }    {''''}    {''''}    {''''}
writetable(Tout, 'OutFile.txt', 'WriteVariableNames',false)
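Because FILE2.txt was imported as text, the numeric-looking columns of the joined table are still char. A possible follow-up is to strip the literal quote characters and convert with str2double. This is only a sketch: it assumes the values carry literal single quotes, as the {''...''} display suggests, and the sample cell array below is hypothetical data copied from that display.

```matlab
% Hypothetical sample of one text column: quoted numbers and one empty quoted field
C = {'''38.24180'''; '''38.17840'''; ''''''};

% Remove the literal single quotes, then convert; non-numeric text becomes NaN
vals = str2double(strrep(C, '''', ''));
```

The same two calls could be applied to a table variable, for example Tout.Var2 = str2double(strrep(Tout.Var2, '''', '')), provided that variable is a cell array of character vectors.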


Asked: 7 Mar 2021
Edited: 8 Mar 2021
