You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
an Alternative funtion which is faster than "ismember"
9 views (last 30 days)
Show older comments
Hello everybody,
I was using ismembertol with XY(Nx2) and xy(Mx2). However code never ends due to the enormous amount of data(N=400million M=80mil.).
Is there any way that I can speed this function. (The matrices are not unique)
[LIA,~]= ismembertol(XY,xy,0.00001,'ByRows',true,'OutputAllIndices',true);
Thank you for your support
20 Comments
Walter Roberson
on 24 Nov 2022
is xy sorted in ascending order? If so then you can call an internal compiled routine, bypassing the sorting step.
Ahmet Hakan UYANIK
on 24 Nov 2022
it is not sorted actually, this XY is the grid coordinates and xy is the river coordinates. I wanted to check which of the grids are in the range of river and assign non-river part to NaN.
Bruno Luong
on 24 Nov 2022
Edited: Bruno Luong
on 24 Nov 2022
If XY is gridded coordinates, then you can use discretize or simple division if they are uniform to determine which grid the river point belong to.
Ahmet Hakan UYANIK
on 24 Nov 2022
how to use discretize function with coordinates? examples are generally given for 1D.
Also, yes XY is the gridded coordinates, normally they are meshgrid of X and Y but i did XY=[X(:) Y(:)], in order to use ismembertol
Bruno Luong
on 24 Nov 2022
Edited: Bruno Luong
on 24 Nov 2022
Yes apply discretize in 1D, whic edge points beeing the mid points of the linear grid 1D coordinates.
You do twice 1D, in x and y. Then they give you which xgrid coordinates is closest in x direction and which ygrid coordinates is the closest in y. Then combine both it provides which 2D grid (mesh)points is the closest.
Ahmet Hakan UYANIK
on 24 Nov 2022
[xbin,xbinx] = discretize(xy(:,1),X(:)); % which column?
[ybin, ybiny]= discretize(xy(:,2),Y(:),); % which row?
when I apply this, second line gives and error as "Bin edges must be a vector that is real, numeric or logical, and monotonically increasing". So i cannot get a proper index out of discretize to be used in XY.
Bruno Luong
on 24 Nov 2022
Edited: Bruno Luong
on 24 Nov 2022
No use the mid points of your linear xgrid NOT meshgrid X and Y. Invoke discretize in 1D.
Ahmet Hakan UYANIK
on 24 Nov 2022
Edited: Ahmet Hakan UYANIK
on 24 Nov 2022
I gave a try but unfortunately it does not give the indices of grid which is river. I mean it doesnt give the same result as ismembertol. Maybe i use it wrong but i have used it like this
[xbin,zzx]= discretize(xy(:,1),unique(X(:)));
[ybin,zzy]= discretize(xy(:,2),unique(Y(:)));
x_id = unique(xbin);
y_id = unique(ybin);
Zi(~y_id,~x_id)=Nan;
xy is the dense point cloud coordinates of the river and X,Y is meshgrid but when I use them with unique, it s just their centerpoint
Bruno Luong
on 24 Nov 2022
I repeat what I write: use the mid points of your linear xgrid
BTW Using unique(X(:) is waste of time and memory where X and Y are created with meshgrid. Use directly the linear grid argument you passed through meshgrid.
Ahmet Hakan UYANIK
on 24 Nov 2022
Edited: Ahmet Hakan UYANIK
on 24 Nov 2022
mid points means the unique x coordinate of X meshgrid and same for Y or? it is waste of memory i agree but in the end it is same coordinates that linear grid argument I passed through meshgrid.
Is it possible to constrain the distance up to some value when using discretize funtion? like do not accept if point is 100 m away from the grid.
Bruno Luong
on 24 Nov 2022
mid point is the mean of two (adjadcent) points.
Example in x; fdo the same in y then combine the results
xgrid = cumsum(randi(5,1,10))
xgrid = 1×10
3 8 13 16 19 20 22 26 30 32
x = min(xgrid)+rand(1,10)*(max(xgrid)-min(xgrid))
x = 1×10
31.8418 19.2050 4.6353 21.3939 25.1672 23.9722 16.4319 5.0414 15.4470 8.8187
midpoints = (xgrid(1:end-1)+xgrid(2:end))/2;
edges = [-Inf midpoints Inf];
iclosest = discretize(x, edges)
iclosest = 1×10
10 5 1 7 8 7 4 1 4 2
xgridclosest = xgrid(iclosest);
d = abs(xgridclosest-x)
d = 1×10
0.1582 0.2050 1.6353 0.6061 0.8328 1.9722 0.4319 2.0414 0.5530 0.8187
Ahmet Hakan UYANIK
on 24 Nov 2022
Edited: Ahmet Hakan UYANIK
on 24 Nov 2022
Thank you for the confirmation. It is really nice for the getting the river observations but I am somehow interested with the grid id. So vice versa of this example would be perfect. With this method, i get all the grid id again but i need to assign non-river grid id's.
So i need to do this in the end:
Z(~iclosest) = NaN;
According to example i gave, it should be
[LIA,~]= ismembertol(XY,xy,0.00001,'ByRows',true,'OutputAllIndices',true);
this give indices of the grid in the size of grid matrix. But your example gives result in the size of river matrix
Bruno Luong
on 24 Nov 2022
Edited: Bruno Luong
on 24 Nov 2022
You just need to create a LIA logical array of size of you meshgrid and initialize to FALSE, then assign the closest index found by TRUE.
It Just a question of reindexing array.
LIA = false(length(ygrid),length(xgrid));
LIA(sub2ind(size(LIA, iclosest_y, iclosest_x)) = true; % iclosest_y and iclosest_x is the closest 1D indexes as I show you earlier
Steven Lord
on 24 Nov 2022
Based on the description I think the fourth and fifth output from the histcounts2 function may be of interest to you.
Ahmet Hakan UYANIK
on 24 Nov 2022
Edited: Ahmet Hakan UYANIK
on 24 Nov 2022
this helps a lot. Thank you
I wanted to additionally ask if iclosest_y and iclosest_x are having bigger size than LIA but still consist of grid's indices. How can i subscript it? e.g my iclosest_y and iclosest_x has a length of 50000 but LIA is 6000x6000.
Bruno Luong
on 24 Nov 2022
Edited: Bruno Luong
on 24 Nov 2022
I don't understand "subscript it".
"It" is what?
Subscipt to do what?
Ahmet Hakan UYANIK
on 24 Nov 2022
My gridded LIA logical matrix has a size of 6000x6000 but since the number of real observations are approx 50000, iclosest_y and iclosest_x has a length of approx 50000 as well. iclosest_y, iclosest_x still consists of the indices of gridded data but I cannot assing it as you did:
LIA(sub2ind(size(LIA, iclosest_y, iclosest_x)) = true;
it says size are not matching
Bruno Luong
on 24 Nov 2022
Edited: Bruno Luong
on 24 Nov 2022
There is a typo, missing a parenthesis.
LIA(sub2ind(size(LIA), iclosest_y, iclosest_x)) = true;
You should verify if these three things are satisfied
- size(LIA) is [6000 6000]
- iclosest_x vector of ~ 50000 values in 1:6000 (6000 is size(LIA,2))
- iclosest_y vector of ~ 50000 values in 1:6000 (6000 is size(LIA,1))
or read the doc and try to understand what each function suppose to do.
Ahmet Hakan UYANIK
on 24 Nov 2022
I deeply appreciate the time and effort you took Bruno. It works perfectly without a loop which is amazing. I also would like to accept this answer but since it is in the comments, the botton does not apper.
Accepted Answer
Bruno Luong
on 24 Nov 2022
If XY is gridded coordinates, then you can use discretize or simple division if they are uniform to determine which grid the river point belong to.
% Generate some toy fake data
xgrid = cumsum(randi(5,1,10))
x = min(xgrid)+rand(1,10)*(max(xgrid)-min(xgrid))
midpoints = (xgrid(1:end-1)+xgrid(2:end))/2;
x_edges = [-Inf midpoints Inf];
iclosest_x = discretize(x, x_edges)
xgridclosest = xgrid(iclosest_x);
d = abs(xgridclosest-x)
Do the same for y, then
LIA = false(length(ygrid),length(xgrid));
LIA(sub2ind(size(LIA), iclosest_y, iclosest_x)) = true;
More Answers (0)
See Also
Categories
Find more on Matrices and Arrays in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)