How to efficiently compute a sum of large vectors with elements depend on the sum index?

Question

Hung Dao on 6 Feb 2021

0
Link

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/737682-how-to-efficiently-compute-a-sum-of-large-vectors-with-elements-depend-on-the-sum-index

Commented: Walter Roberson on 7 Feb 2021

Accepted Answer: Walter Roberson

I have to compute the sum:

, where

are large column vectors ( dimension is about 10 millions) and κ is about 100 to 200.

My first attempt is

, with

(a row vector) and

denotes the element-wise product.

My second attempt is to use for loop.

Both attempt are still very slow but for loop is slightly faster.

My questions is : Is there any way to compute the sum more efficiently?

Any suggestion would be much appreciated !

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

Walter Roberson on 6 Feb 2021

0
Link

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/737682-how-to-efficiently-compute-a-sum-of-large-vectors-with-elements-depend-on-the-sum-index#answer_616242

Open in MATLAB Online

%assume that z and y are COLUMN vectors
K = (1:kappa).';  %COLUMN vector not row vector
result = sum( K.*sin(K.*z.').*exp(K.*y.') );

What this does is improve locality of reference. The values to be summed will be consecutive ones in the matrix, instead of being scattered around the matrix.

2 Comments
Show NoneHide None

Hung Dao on 7 Feb 2021

Thanks. I've tried your suggestion and it slightly improves.

My first attempt: 4.29s vs your suggestion: 3.8s which is good but is not as effecient as I expected.

Is it possible to reduce the computational time even more?

Thanks

Walter Roberson on 7 Feb 2021

In the case that there are a lot of duplicate values in z or y, then it might become practical to find the unique values, take the sin only for them, and get back to the appropriate extended value using indexing. This would, however, generally be slower because of the overhead in finding the duplicates. For Intel i9 "Coffee Lake" the cycle time for FSIN is 53 to 105 cyles, with another result available after 50 to 120 cycles (so if you have a bunch queued up you might shave 3 cycles off.) Therefore if you were able to do duplicate detection in less than 50 cycles, you might be able to save time overall. As a rough estimation, that would start to become practical if you had on the order of less than 10 unique values (maybe 20-ish even.)

Another situation in which you might be able to get some improvement is the case in which you are tight on memory, so that forming the full arrays is causing a lot of overhead on memory management.

What magnitude range do the z values have? If they are negative then K*z < roughly -36 is going to have an exp() that does not contribute significantly to the sum and it might be possible to filter out values. exp(400*z) is going to be either very big or very small unless abs(z) < 1/10 (approximately)

Sign in to comment.

How to efficiently compute a sum of large vectors with elements depend on the sum index?

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

2 Comments
Show NoneHide None

More Answers (0)

See Also

Categories

Tags

Community Treasure Hunt

How to efficiently compute a sum of large vectors with elements depend on the sum index?

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

2 Comments Show NoneHide None

More Answers (0)

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

2 Comments
Show NoneHide None