Comparison between elements of matrix of different data type

I recently wrote a few lines of code to compare adjacent pairs of elements in a matrix whose values are integers:
test_mat = [99 100 54 32 14; 89 4 41 2 3; 87 64 32 19 20];
The matrix I actually have is 200,000x5. When I pass it in for comparison, it takes roughly 2 minutes to complete. However, I have another matrix that contains values like these:
test_mat2 = [0.0482 0.0050 0.0516 0.0063 0.0058; 0.0847 0.0008 0.0071 0.0086 0.0502];
The one I'm actually using is also a 200,000x5 matrix, containing data like test_mat2 above. I notice that the comparison takes much longer than for the first matrix of integers. Is there a reason for this? Is comparison more expensive for numbers with decimals?
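The comparison code itself is not shown, so the following is only a minimal sketch of one way such a comparison could be vectorized. It assumes that "adjacent pairs" means each element and its right-hand neighbour in the same row, and that the comparison is "greater than"; both are assumptions, as are the variable names greater_int and greater_dbl.

```matlab
% Sketch only: vectorized comparison of horizontally adjacent elements.
% Assumption: each element is compared with its right-hand neighbour using ">".
test_mat  = [99 100 54 32 14; 89 4 41 2 3; 87 64 32 19 20];
test_mat2 = [0.0482 0.0050 0.0516 0.0063 0.0058; 0.0847 0.0008 0.0071 0.0086 0.0502];

% Each result is a logical matrix with one column fewer than its input:
% entry (r, c) is true when element (r, c) is greater than element (r, c+1).
greater_int = test_mat(:, 1:end-1)  > test_mat(:, 2:end);
greater_dbl = test_mat2(:, 1:end-1) > test_mat2(:, 2:end);

disp(greater_int)
disp(greater_dbl)
```

The same two lines work unchanged on a 200,000x5 matrix, whether it holds integers or decimals, because no explicit loop over rows is involved.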
  3 Comments
Jan on 4 Sep 2019
Edited: Jan on 4 Sep 2019
@Guillaume: Your tests measure not only the time for the comparison, but also the time for creating the vectors. mdouble(1:2:end) needs more time than mint(1:2:end), because it has to allocate and write more bytes.
mint = randi([0 255], 2e5, 5, 'uint8');
mdouble = double(mint);
mdouble2 = mdouble + rand(2e5, 5);
timeit(@() mint == mint)
>> 0.000305
timeit(@() mdouble == mdouble)
>> 0.00031
timeit(@() mdouble2 == mdouble2)
>> 0.00031
In theory the UINT8 comparison is cheaper, because for double the comparison NaN==NaN must be treated as an exception. It looks like this is already implemented in the CPU, so that both need the same time.
I'd expect a difference in the timings due to the memory bandwidth if the data do not fit into the processor cache. I've tested this in MATLAB Online only, so please repeat the test on a real machine.
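To check the memory bandwidth effect, the same test could be repeated with arrays that are far larger than the cache. This is a rough sketch, not a measured result: the element count n = 2e7 is an assumption chosen so that the double array (160 MB) clearly exceeds typical cache sizes, and the variable names big_int, big_double, t_int and t_double are made up for illustration.

```matlab
% Sketch only: repeat the timing with arrays assumed to be larger than the cache.
n = 2e7;                                   % assumed size: 20 MB as uint8, 160 MB as double
big_int    = randi([0 255], n, 1, 'uint8');
big_double = double(big_int);              % same values, 8 bytes per element

t_int    = timeit(@() big_int == big_int);
t_double = timeit(@() big_double == big_double);

fprintf('uint8:  %.5f s\n', t_int);
fprintf('double: %.5f s\n', t_double);
fprintf('ratio double/uint8: %.2f\n', t_double / t_int);
```

The timings depend on the machine, so no expected numbers are given here.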
Guillaume on 4 Sep 2019
@Jan, indeed. However, there doesn't appear to be much difference in timing for allocating uint8 or double:
>> timeit(@() randi([0 255], 2e5, 5, 'uint8'))
ans =
0.012459
>> timeit(@() randi([0 255], 2e5, 5, 'double'))
ans =
0.01323


Answers (1)

Nikhil Sonavane on 4 Sep 2019
Floating-point numbers are stored in memory very differently from integers. Hence, the algorithm used for comparing floating-point numbers is also different from that for integers. I would suggest you go through the Floating-Point Representation to understand this better. Also, floating-point numbers often need more memory than integers: a double occupies 8 bytes, whereas the integer types occupy 1 to 8 bytes. For more information please refer to the documentation of integers and floating-point numbers.
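The memory point can be checked directly with whos. This is a small sketch using arrays of the same 200,000x5 size as in the question; the variable names a_uint8, a_int64 and a_double are made up for illustration.

```matlab
% Sketch only: memory used by arrays of the same size but different classes.
a_uint8  = zeros(2e5, 5, 'uint8');   % 1 byte per element
a_int64  = zeros(2e5, 5, 'int64');   % 8 bytes per element
a_double = zeros(2e5, 5);            % double is the default class, 8 bytes per element

info = whos('a_uint8', 'a_int64', 'a_double');
for k = 1:numel(info)
    fprintf('%-9s %-7s %9d bytes\n', info(k).name, info(k).class, info(k).bytes);
end
```

With 1,000,000 elements each, the uint8 array takes 1,000,000 bytes while the int64 and double arrays take 8,000,000 bytes each, so a double array is not larger than an integer array of a 64-bit type.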
  2 Comments
Jan on 4 Sep 2019
For the == operator the floating-point representation matters only in special cases: NaN==NaN must return false even if the bit representations are equal, and 0==-0 must return true although the bits differ. For everything else, comparing a double is equivalent to comparing a vector of 8 UINT8 values.
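A short sketch of the NaN behaviour, using typecast to look at the raw bits. The variable names are made up for illustration; the signed-zero lines show a related case where the values compare equal although their bits differ.

```matlab
% Sketch only: == on doubles is not a plain bitwise comparison.
x = NaN;
y = x;                                    % exact copy, so the same bit pattern

same_bits_nan = isequal(typecast(x, 'uint64'), typecast(y, 'uint64'));  % true
eq_nan        = (x == y);                 % false: NaN never equals anything
eqn_nan       = isequaln(x, y);           % true: isequaln treats NaNs as equal

% Signed zeros: the values compare equal, but the sign bit differs.
pz = 0;
nz = -0;
same_bits_zero = isequal(typecast(pz, 'uint64'), typecast(nz, 'uint64'));  % false
eq_zero        = (pz == nz);              % true

fprintf('NaN:  same bits = %d, == gives %d, isequaln gives %d\n', ...
        same_bits_nan, eq_nan, eqn_nan);
fprintf('Zero: same bits = %d, == gives %d\n', same_bits_zero, eq_zero);
```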
Guillaume on 4 Sep 2019
And of course, if the original vector is of a 64-bit integer type, then there is the same number of bytes to compare. I would still expect double comparison to be marginally slower because of the need to test for NaN. Also, if I recall correctly, modern processors have different pipelines for floating-point and integer operations.
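A sketch of how that expectation could be tested with equal byte counts on both sides. The variable names are made up for illustration, and the int64 matrix is created by conversion because, as far as I know, randi does not accept 'int64' as a class name.

```matlab
% Sketch only: compare int64 and double, which both use 8 bytes per element.
vals    = randi([0 255], 2e5, 5);   % double matrix holding integer values
mint64  = int64(vals);              % same values stored as 64-bit integers
mdouble = vals;

t_int64  = timeit(@() mint64 == mint64);
t_double = timeit(@() mdouble == mdouble);

fprintf('int64:  %.6f s\n', t_int64);
fprintf('double: %.6f s\n', t_double);
```

Any difference between the two timings then comes from the comparison itself rather than from the amount of memory read.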
