MATLAB Answers

Herbert
0

Alternative Matrix Multiplication which does euclidean distance instead of dot product

Asked by Herbert
on 7 Jun 2013
Of course, a great plus of matlab is to be able to do a matrix multiplication without writing 3 loops, and with lots of syntax sugar (for example, imagine having to write C = A.mult(B) each time). Moreover, it will be fast because the interpreter just reads 3 symbols (namely: A*B) and turn to assembly or whatever low level code matlab uses there.
However, what I often have is that I have a set of data points of multiple dimensions, for example 3, and another set of data points, again with for example 3 dimensions. Then I want to compare each data point of one set to another using a distance measure. If my distance measure is a dot product, then of course a matrix product would suffice:
A : m x 3 matrix
B : 3 x n matrix
C=A*B, then C(i,j) is the distance between data point i in set A and data point j in set B.
But what if my distance measure is euclidean? Then I would like to be able to do the same. However, whenever the * operation does a multiplication, it should subtract and square, and whenever the * operation does a sum, I also want it to do a sum. C=A*B would create a simular matrix as above, yet using euclidean distance.
Of course, I can simulate this writing a function, maybe one forloop and some matrix magic. However, I want (1) syntax sugar and (2) fast c/assembly implementation. Moreover, I think such a use case is actually quite realistic. I often want to find for each x out of M data points to which of some other reference database of N data points, x lies closest (using euclidean distance). For example in k-means when you try to decide to wich centroid each data point belongs.
So, does this exist? Would this be a nice feature? Or am I the only one who sees this as a 'nice to have'? Should I write a c-file for this and mex it?

  0 Comments

Sign in to comment.

1 Answer

Answer by Matt J
on 7 Jun 2013
Edited by Matt J
on 7 Jun 2013
 Accepted Answer

You can get something like what you're after using this
and using my INTERDISTS utiltity below. For example
>> A=rand(2,3), B=rand(3,2),
A =
0.9448 0.4893 0.9001
0.4909 0.3377 0.3692
B =
0.1112 0.2417
0.7803 0.4039
0.3897 0.0965
>> Ops.mtimes=@(a,b)interdists(a.',b);
>> A=DataObj(A,'Ops',Ops); B=DataObj(B,'Ops',Ops);
>> Euclidean = A*B
Euclidean =
1.0198 1.0712
0.5834 0.3753
My overall experience though is that syntactic sugar is rarely ever worth it. Better just to use the interdists function directly.
function Graph=interdists(A,B)
%Finds the graph of distances between point coordinates
%
% (1) Graph=interdists(A,B)
%
% in:
%
% A: matrix whose columns are coordinates of points, for example
% [[x1;y1;z1], [x2;y2;z2] ,..., [xM;yM;zM]]
% but the columns may be points in a space of any dimension, not just 3D.
%
% B: A second matrix whose columns are coordinates of points in the same
% Euclidean space. Default B=A.
%
%
% out:
%
% Graph: The MxN matrix of separation distances in l2 norm between the coordinates.
% Namely, Graph(i,j) will be the distance between A(:,i) and B(:,j).
%
%
% (2) interdists(A,'noself') is the same as interdists(A), except the output
% diagonals will be NaN instead of zero. Hence, for example, operations
% like min(interdists(A,'noself')) will ignore self-distances.
%
% See also getgraph
noself=false;
if nargin<2
B=A;
elseif ischar(B)&&strcmpi(B,'noself')
noself=true;
B=A;
end
N=size(A,1);
B=reshape(B,N,1,[]);
Graph=l2norm(bsxfun(@minus, A, B),1);
Graph=squeeze(Graph);
if noself
n=length(Graph);
Graph(linspace(1,n^2,n))=nan;
end

  3 Comments

Based on your code, you really took the time to understand and solve my problem, which I appreciate a lot! I was not aware of bsxfun, which I think is also useful in other situations where one wants to do some operation cartesion wise on two sets of vectors. Btw, I don't think it's a good idea to actually 'overwrite' the mtimes operations, as I can not imagine matlab without the original matrix multiplicaiton :) However, your answer is clear and much appreciated!
Btw, I don't think it's a good idea to actually 'overwrite' the mtimes operations, as I can not imagine matlab without the original matrix multiplicaiton
Ah, well, the original mtimes isn't gone. You just have to convert back to the ordinary matrix data type to use it,
>> Euclidean = A*B
Euclidean =
0.6025 0.6298
0.7382 0.2923
>> DotProduct= double(A)*double(B)
DotProduct =
0.5023 0.7829
0.6347 1.1619

Sign in to comment.