Removing selected data using unique

Hello everyone,
I have a question about how to remove selected data based on a condition. I want to remove values from column 1 and column 2 based on column 3 and column 4.
The requirement is that a value in column 1 is removed if the corresponding value in column 3 satisfies column 3 < 3.0 and column 3 > -3.0. The same rule applies to column 2, based on column 4 < 3.0 and column 4 > -3.0.
Can anyone help me use unique to remove these values from column 1 and column 2?
Thanks

1 Comment

There will be nothing left in the file:
D = load('InpUniqueSandSoil.txt');
[L,H] = bounds(D)
producing:
L =
0.2000 0.3476 -1.9411 -1.4528
H =
47.0000 29.5447 2.6284 2.9843
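
The reason nothing is left can be checked from the bounds themselves. This is a minimal sketch that hard-codes the L and H values printed above instead of reloading the file; the variable name insideRange is only illustrative.

```matlab
% Bounds reported above for the four columns of the file.
L = [0.2000 0.3476 -1.9411 -1.4528];
H = [47.0000 29.5447 2.6284 2.9843];

% A column lies entirely inside (-3, 3) when both of its bounds do.
% Element 1 refers to column 3, element 2 to column 4.
insideRange = L(3:4) > -3 & H(3:4) < 3;
disp(insideRange)
```

If both elements are true, every row meets the removal condition, so the filter removes all of the data.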


 Accepted Answer

Image Analyst on 27 Apr 2019
Edited: Image Analyst on 28 Apr 2019
Try
filename = 'InpUniqueSandSoil.txt';
data = dlmread(filename, ' ');
col1 = data(:, 1);
col2 = data(:, 2);
col3 = data(:, 3);
col4 = data(:, 4);
% Column 1 is removed where column 3 satisfies -3.0 < column 3 < 3.0.
rowsToDelete = col3 > -3 & col3 < 3;
col1(rowsToDelete) = []; % Method 1: set to null.
% Column 2 is handled the same way, based on -3.0 < column 4 < 3.0.
rowsToKeep = col4 <= -3 | col4 >= 3;
col2 = col2(rowsToKeep); % Method 2: extract only the ones you want.

12 Comments

Thank you, Image Analyst, but I forgot to add an important piece of information: column 1 and column 2 are paired data, so deleting a value in column 1 must automatically delete the matching value in column 2.
Thanks
It is still not working.
Well yes, that is an important thing you left out. So try this:
fileName = 'InpUniqueSandSoil.txt';
data = importdata(fileName)
col1 = data(:, 1);
col2 = data(:, 2);
col3 = data(:, 3);
col4 = data(:, 4);
% Column 1 is removed where column 3 satisfies -3.0 < column 3 < 3.0.
% Column 2 is removed where column 4 satisfies -3.0 < column 4 < 3.0.
rowsToDelete = (col3 > -3 & col3 < 3) | (col4 > -3 & col4 < 3);
col1(rowsToDelete) = []; % Set to null.
col2(rowsToDelete) = []; % Set to null.
I also followed your code to keep the data:
rowsToKeep = (Zspt > -3 | Zspt < 3) & (Zqc >= -3 & Zqc <= 3);
X2(rowsToKeep) = []; % Set to null.
Y2(rowsToKeep) = []; % Set to null.
But I get this error and I don't know why:
Index of element to remove exceeds matrix dimensions.
Error in Complete_Linier_Regression (line 53)
X2(rowsToKeep) = []; % Set to null.
To keep data you don't set it to null! That deletes data. To keep data, you compute the indexes and extract only those, so use
X2 = X2(rowsToKeep);
Y2 = Y2(rowsToKeep);
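
The difference between the two methods can be seen on a small example. This is a minimal sketch with hypothetical toy vectors x and z, not data from this thread. It also hints at one possible cause of the error above, which is an assumption because the sizes of Zspt and X2 are not shown: a logical mask that is longer than the vector it indexes.

```matlab
% Hypothetical toy data, not taken from the file in this thread.
x = [10; 20; 30; 40];
z = [0.5; -4.2; 2.9; 3.7];

% Logical mask with the same number of elements as x.
inRange = z > -3 & z < 3;

% Method 1: delete the elements flagged by the mask.
xDeleted = x;
xDeleted(inRange) = [];

% Method 2: keep only the elements NOT flagged by the mask.
xKept = x(~inRange);

% Both methods leave the same elements behind.
sameResult = isequal(xDeleted, xKept);
disp(sameResult)
```

Checking numel(mask) == numel(x) before indexing is a cheap way to catch a mask built from a different data set.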
So the code should look like this:
rowsToDelete = (Zspt < -3 | Zspt > 3) & (Zqc <= -3 & Zqc >= 3);
X1(rowsToDelete) = []; % Set to null.
Y1(rowsToDelete) = []; % Set to null.
rowsToKeep = (Zspt > -3 | Zspt < 3) & (Zqc >= -3 & Zqc <= 3);
X2(rowsToKeep)= X2;
Y2(rowsToKeep)= Y2;
No. First of all, you'd need to do only one of those methods. Either
  1. delete the ones you don't want by setting them to null, OR
  2. extract only the ones you DO want
but not both! So I think you'd do
fileName = 'InpUniqueSandSoil.txt';
data = importdata(fileName)
col1 = data(:, 1);
col2 = data(:, 2);
col3 = data(:, 3);
col4 = data(:, 4);
% Column 1 is removed where column 3 satisfies -3.0 < column 3 < 3.0.
% Column 2 is removed where column 4 satisfies -3.0 < column 4 < 3.0.
rowsToDelete = (col3 > -3 & col3 < 3) | (col4 > -3 & col4 < 3);
col1(rowsToDelete) = []; % Set to null.
col2(rowsToDelete) = []; % Set to null.
That should delete the row from both col1 and col2 if EITHER col3 or col4 is strictly between -3 and 3.
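
Because the columns are paired, the same mask can also be applied to whole rows of the matrix, which keeps all four columns aligned. This is a sketch on a hypothetical 4-by-4 matrix standing in for the file contents; the name filtered is only illustrative.

```matlab
% Hypothetical 4-column data standing in for the contents of the file.
data = [ 5  1.2  0.4 -0.1;
        12  3.4 -3.5  4.0;
        20  6.1  2.0 -3.2;
        33  9.8  3.1  3.3];

col3 = data(:, 3);
col4 = data(:, 4);

% A row is flagged when either column 3 or column 4 is inside (-3, 3).
rowsToDelete = (col3 > -3 & col3 < 3) | (col4 > -3 & col4 < 3);

% Keep every column of the rows that were not flagged.
filtered = data(~rowsToDelete, :);
disp(filtered)
```

Here rows 1 and 3 are flagged through column 3, so only rows 2 and 4 survive.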
Okay, it works now.
Can I ask another question? My final goal is to do a statistical linear regression.
I need one more step to classify the data.
So I classified it into 3 classes.
The process is like this:
The filtered soil data are divided into three classes: lower, middle and upper. The lower limit of class 1 is the smallest value of the data, and the upper limit of class 3 is the largest value of the data. The range used for the boundaries between classes is found by taking the difference between the maximum and minimum X and Y values and dividing it by the number of classes. The class 1 boundary for the SPT-N values (column 1) and CPT (column 2) is the minimum value plus the data range. The class 2 boundary is the class 1 boundary plus the data range, and the class 3 boundary is the class 2 boundary plus the data range. The screening is done by placing the sorted data according to the class values: the class 1 CPT (column 2) data are limited to values below the SPT-N (column 1) boundary value and are then reduced by the minimum CPT value. A CPT value used in the regression must be smaller than the minimum CPT boundary; otherwise it must be rejected.
You can see the details in the attached Excel workbook. I want to know how to do the processing shown in the workbook in phases III (Classified) to IV (Final).
In the workbook, I don't see a class 1, just a class II, III, and IV. Is class II class 1, and class III class 2, and class IV actually class 3?
Plus, I don't know why class II has 4 columns, and the other two have 3 columns (one of which is called outliers). Please explain better what each column means and why there are different numbers of columns in each class.
Yes, I mean that classes 1, 2 and 3 are represented in G5 to G165. They are marked with colours. Please see the remarks in N9 to N11. The names class II, III and IV are the steps of processing the data to remove outliers. Again, I want to develop a regression, so I divided the work into 4 steps. In the workbook I mentioned only the last three steps, which give the regression formula y = 0.3079 + 0.9367 with a coefficient of determination R2 = 0.2693. I will attach the workbook again with an additional figure of the regression.
Thx
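
For reference, a straight-line fit and its R2 can be computed with polyfit and polyval. This is a minimal sketch on hypothetical paired data, not the workbook values, and it is not meant to reproduce the numbers quoted above.

```matlab
% Hypothetical paired data, not taken from the workbook.
x = [1; 2; 3; 4; 5];
y = [1.1; 1.9; 3.2; 3.8; 5.1];

% First-degree polynomial fit: p(1) is the slope, p(2) the intercept.
p = polyfit(x, y, 1);
yFit = polyval(p, x);

% Coefficient of determination from residual and total sums of squares.
ssRes = sum((y - yFit).^2);
ssTot = sum((y - mean(y)).^2);
r2 = 1 - ssRes / ssTot;

fprintf('y = %.4f*x + %.4f, R2 = %.4f\n', p(1), p(2), r2);
```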
Dear Image Analyst,
I have standardized the data by applying the limit conditions we discussed previously, and I have added more information on the path towards the linear regression. I don't know how to write the code for the subtraction process for Class Y1 to Class Y3 while taking Class X1 to Class X3 into account.
The next step is to build class intervals to detect outliers (please focus on Step III-Classified and Step IV-Final):
  1. Define the minimum (X min = 4 and Y min = 0.3475) and maximum (X max = 40 and Y max = 24.765) of the values in column 1 and column 2.
  2. Define the class interval by calculating the difference between X max and X min and dividing it by 3 (three class bins).
  3. Define the limit of each class: Class X1 = X min + X mean, Class X2 = Class X1 + X mean and Class X3 = Class X2 + X mean.
  4. Define the limit of each class: Class Y1 = Y min + Y mean, Class Y2 = Class Y1 + Y mean and Class Y3 = Class Y2 + Y mean.
  5. Detect outliers in the base data:
     a. Class Y1: take the difference between the values of column 2 (J5-J61) and the limit value of Class Y1 (T7).
     b. Class Y2: take the difference between the values of column 2 (J62-J139) and the limit value of Class Y2 (U7).
     c. Class Y3: take the difference between the values of column 2 (J140-J165) and the limit value of Class Y3 (V7).
  6. For each of Class Y1, Class Y2 and Class Y3, the distribution of the data is limited by Class X1 for Class Y1, Class X2 for Class Y2 and Class X3 for Class Y3.
  7. Values in each class that are higher than 1 are removed; the others are kept. When a value in column 2 is removed, the paired value in column 1 must also be removed, because column 1 and column 2 are pairs of data.
  8. Calculate the linear regression.
Thanks
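
Steps 1 to 4 above can be sketched as follows. This assumes that "X mean" and "Y mean" in the list refer to the class interval from step 2, meaning (max - min) / 3; that reading is an assumption, and the variable names are only illustrative. The outlier steps are not covered here because the workbook is needed to interpret them.

```matlab
% Minimum and maximum values quoted in step 1.
xMin = 4;       xMax = 40;
yMin = 0.3475;  yMax = 24.765;
nClasses = 3;

% Class interval (step 2), assumed to be what the list calls X mean / Y mean.
xStep = (xMax - xMin) / nClasses;
yStep = (yMax - yMin) / nClasses;

% Upper limits of classes 1 to 3 (steps 3 and 4).
classX = xMin + (1:nClasses) * xStep;
classY = yMin + (1:nClasses) * yStep;

disp(classX)
disp(classY)
```

Under this assumption the last limit in each vector coincides with the maximum, so the three classes cover the full data range.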
I really doubt I'll be able to find enough time to carve out of my very busy schedule to delve into this. It doesn't look like a simple 5 minute task. Sorry but good luck. Call Mathworks tech support and they may be able to help.
