How does MATLAB implement the : operator?

3 views (last 30 days)
AlexRD
AlexRD on 24 Apr 2021
Edited: Jan on 26 Apr 2021
Hey folks, i've been learning how indexing works in matlab, and i've come upon a particular problem that i think would benefit from MATLAB's implementation.
So, let's imagine you have a source array that looks like this:
[1 4 7]
[2 5 8]
[3 6 9]
Because of how linear indexing works, i set these values specifically to match the linear index. So if you were to do something like array(:), you'd get [1 2 3 4 5 6 7 8 9].
Now, let's say you you want to access the first and third value from all columns, you'd do something like:
value = array([1, 3], :);
The : operator automatically adds the appropriate bias to the index on every column (so the index matches the linear index), and it does so very efficiently. If you were to mimic this in code, you'd do something like:
array = [1 4 7; 2 5 8; 3 6 9];
[numCols, numRows] = size(array);
index = repmat([1; 3], 1, numCols); % Repeats the index a number of times equal to the amount of columns
for i=1:numCols
index(:, i) = index(:, i) + (i-1) * numRows; % Adds the column index to each column so the linear index matches
end
% Result 1 and 2 are the same
result1 = array(index);
result2 = array([1,3], :);
My problem is, building the index manually takes a lot longer to complete than using the : operator, even if you're comparing the use where the index is already built. Please note, i am comparing the speed of a prebuilt index, so basically just the time it takes to run the line in result1 versus result2.
In my tests, i used huge arrays, because that's where i'll be using this. My guess is that the large array for the index ends up slowing down the code, even if you change the data type to uint16 for example. The only way to avoid this large index array would be to do this in a loop, something like this:
result = zeros(2, 3);
for i=1:numCols
result(:, i) = array((i-1) * numRows + [1 3]);
end
But this slows down the code significantly too. So how does Matlab implement the : operator?

Accepted Answer

Jan
Jan on 25 Apr 2021
Edited: Jan on 26 Apr 2021
The colon operator tells Matlab to use the complete array or dimension. Then Matlab does not have to test, if the indices are inside the valid range. If you create an index matrix manually, each element is checkd when the index is apllied:
X = reshape(1:9, 3, 3);
y = X(:, 1) % Fast access of the first column
y = X(1:3, 1) % Slower, because this must be performed:
% index1 = 1, index1 == round(index1) && index1 > 0 && index1 <= size(X, 1)
% index2 = 3, index2 == round(index2) && index2 > 0 && index2 <= size(X, 1)
% Then Matlab copies the block from index1 to index2
y = X([1,2,3], :) % Even slower, because now each element is checked, if
% if is inside the valid range.
% Then the data is copied elementwise, because in opposite to 1:3, Matlab
% cannot know, that the indices are contiguous.
The same applies for FOR loops:
for k = 1:1e6 % Fast
end
v = 1:1e6;
for k = v % Slower
end
This means that efficient code should use the colon operator for indexing, whenever this is possible. Using an array of indices is much more expensive.
i1 = 13; i2 = 270;
index = i1:i2;
v(index) % Slow
v(i1:i2) % Fast
Remember, that the colon operator can be applied to a dynamically determined dimension also using a cell array:
X = rand(3,4,5,6,7);
IndexC = {1, 2, ':', 3, 4};
X(IndexC{:})
  3 Comments
AlexRD
AlexRD on 25 Apr 2021
I see. Is there a way to avoid having Matlab check for errors? I couldn't get the cell array indexing to work.
Also, am i on a fool's errand here? Cause I'm trying to optimize the performance of an interpreted language, wouldn't i be better of just writing C++ code instead? Surely pointers would be the best implementation for this right? To have an array of pointers i mean.
Jan
Jan on 26 Apr 2021
Edited: Jan on 26 Apr 2021
There are 2 reasons why
v = 1:1e6; for k = v
is slower: 1. the vector v must be created explicitly. 2. Matlab cannot exploit, that the values of v are contiguos. The JIT acceleration can use such details to optimize the code.
"Is there a way to avoid having Matlab check for errors" - no, this is an important and fundamental part of Matlab.
"I couldn't get the cell array indexing to work." - this is another problem. Post the corresponding code and explain the problem.
Of course you can write C++ code instead. If the Matlab code has indexing problems, you get an error message. In C you do not get a message, but the application terminates.
The debugging in C and C++ is much harder. The time, until a problem is solved is the sum of:
time = design + programming + debugging&testing + documenting + runtime
Matlab is not the leading programming language with respect to runtime. But I can solve many problems much faster in Matlab. Compare the singular value decomposition:
% MATLAB:
[U, S, VT] = svd(A)
% C:
//dgesvd_(jobu, jobvt, m, n,
// A, LDA, S, U, LDU, VT, LDVT, work, lwork, info)
int_BLAS lwork = 256; // Must match problem size
double work[256]; // Optimal for DGESVD [3x3]: 201
BLASCALL(dgesvd)("A", "A", &M, &M, A, &M, S, U, &M, VT, &M,
work, &lwork, &info);

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!