Integrating bivariate normal distribution in polar coordinates efficiently

Question

supernoob on 2 Jul 2018

0
Link

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/408464-integrating-bivariate-normal-distribution-in-polar-coordinates-efficiently

Commented: John D'Errico on 3 Jul 2018

Hi there,

I have a bivariate normal distribution which I would like to integrate over a specified area in polar coordinates. I have the distribution either as mvnpdf() or as a 2D matrix of values. I do not have it as an equation/expression, though I could cast it into one. What is the fastest way to integrate over this distribution in polar coordinates? I can see that I can evaluate double integrals in polar coordinates using integral2 (https://www.mathworks.com/help/matlab/ref/integral2.html) but I will first have to express my pdf as an equation in polar coordinates. I am concerned about speed/efficiency as I have to do this over and over again as part of a simulation. Thanks in advance!

2 Comments
Show NoneHide None

Torsten on 3 Jul 2018

See the example "Multiple Numerical Integrations" under

https://de.mathworks.com/help/matlab/ref/trapz.html

Best wishes

Torsten.

supernoob on 3 Jul 2018

Thanks! This is really helpful.

Sign in to comment.

Sign in to answer this question.

Answer 1

John D'Errico on 3 Jul 2018

0
Link

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/408464-integrating-bivariate-normal-distribution-in-polar-coordinates-efficiently#answer_327246

Edited: John D'Errico on 3 Jul 2018

Open in MATLAB Online

Why do you need to re-express the PDF in polar form? It is one line of code to convert polar form into cartesian form to do the integration in the form you desire, thus in polar coordinates.

Integral2 will send points into your kernel function to evaluate the integrand. They will be in polar form. So what? In one line of code, convert them into cartesian coordinates.

help pol2cart

Send them into the multivariate normal PDF. This costs almost nothing in terms of time, as far more of the time will be taken by the PDF evaluation and by integral2 itself.

Don't trust me? Seriously, this takes NO effort at all to do. For example, what is the integral of a bivariate normal PDF within a unit circle of radius 1?

function K = polarkernel(r,theta)
  % convert to cartesian coords
  [x,y] = pol2cart(theta,r);
  Prtheta = reshape(mvnpdf([x(:),y(:)]),size(r));
  % don't forget to multiply by r, because the
  % differential term when you work in polar
  % coordinates converts from dx*dy, into r*dr*dtheta.
  K = Prtheta.*r;
end

Now, call integral2.

integral2(@polarkernel,0,1,0,2*pi)
ans =
         0.393469340287005

Now, is this the correct result? To verify that fact, lets do the conversion ourselves. Ok, I'll do it myself.

A standard bivariate normal distribution, with unit variances, and zero correlation & covariances is simple, even trivial to write:

P(x,y) = exp(-(x^2 + y^2)/2)/(2*pi)

In polar form? In this case, it is even simpler! This is because it is rotationally symmetric. And in polar form, (x^2+y^2) reduces to just r^2.

P(r,theta) = exp(-r^2/2)/(2*pi)

Even easier, is then to recognize the integrand itself is not dependent on theta. That is ONLY true in this specific case, because I chose a trivial case to do my test. If you have some non-trivial normal pdf, with non-zero correlation, you will still need to convert to cartesian coordinates as I showed. But this is a test to see if what I did worked!

The integral of P(r,theta)*r is simple here, because for a rotationaly symmetric kernel, the integral on theta becomes merely a multiplication by 2*pi. So the 1/(2*pi) just cancels out, goes away like dust on the wind.

I'll do it symbolically. Actually, I could do this part with paper and pencil. But then, you would need to trust my algebraic capabilities.

syms r
int(exp(-r^2/2)*r,[0,1])
ans =
1 - exp(-1/2)
vpa(ans)
ans =
0.39346934028736657639620046500882

Which, to within the default tolerances of integral2, is as close as it will get. Well, at least good enough for government work.

4 Comments
Show 2 older commentsHide 2 older comments

John D'Errico on 3 Jul 2018

Open in MATLAB Online

pol2cart is NOT inefficient. In fact, you talk about casting between cartesian and polar coordinates. What do you think pol2cart does?

Would it have made you happier if I had suggested

x = r.*cos(theta); y = r.*sin(theta):

instead? This is not a computation worth worrying about. It is not a question of writing to a table, of some complicated computation. pol2cart does EXACTLY what that reference indicates. All it does is a couple of trig function evals. If that is what you are worried about, believe me, you are worrying about the wrong thing.

A cardinal rule of writing code it to not pre-optimize your code, thus expend a lot of programmer time trying to eke out an extra millisecond. Don't spend the time until you find that something is a bottleneck. Get the computations right, THEN look to see if there is a time problem.

As far as the use of trapz goes here, you will spend a lot more time there for a similar accuracy, because trapz is quite a low order rule.

I'm sorry, but you may have your priorities wrong.

supernoob on 3 Jul 2018

Edited: supernoob on 3 Jul 2018

There's no need to be rude my guy. I'm learning and trying to optimize my code. I'm not sure what is and isn't efficient, that's why I asked. I've already written the code and am in the stage of optimizing. I'm looking for every little bit I can get as the simulation I am running is large. My question was literally "what is the fastest way to integrate over this distribution in polar coordinates". Also pol2cart is not useful to me. I would cast into polar coords, not the other way around. Thanks for your help, I appreciate your efforts. Have a nice day.

John D'Errico on 3 Jul 2018

Open in MATLAB Online

No. You simply have not understood anything I said. I'm not trying to be rude. Merely to explain to you that you are spending a lot of effort to optimize something that is almost certainly not worth the effort.

You stated that you want to integrate in polar coordinates. That means integral2 will be working on r and theta. You give it limits of integration on those parameters to define the domain. It will feed your kernel r and theta values in the domain of interest.

Then you need to convert r and theta into cartesian coordinates, because the multivariate Normal PDF is defined in that domain. Even if you have someother function that uses r and theta, multiplied by the multivariate normal PDF, you need x and y.

(Note: IF the bivariate normal PDF in question has a diagonal covariance matrix, then the conversion to polar form is actually quite simple. In fact, I did that for you in my answer. So IF the bivariate normal has a diagonal covariance matrix, with both variances identical, then you can indeed avoid the conversion into cartesian coordinates.) You have said nothing about the specific bivariate PDF at all, so I cannot know.

Is the conversion a complicated thing to do? Not even close. Nor is it even remotely expensive to compute.

For example, a typical call to integral2 will probably require something on the order of 1000 points to evaluate, to rarely as many as 10000 points, depending upon the integration tolerances, and the complexity of your function. How nonlinear is it? 10K is pretty high in my experience with that code on any average function. For an average case, I'll assume 3000 function evaluations. So lets guess 3000 conversions to cartesian coordinates for EACH time you call integral2.

How long does that take to do? Again, 3k conversions is FAST. For example, I'll do 30 million such conversions for a test, equivalent to calling integral2 10000 times.

r = rand(1,3e7);
th = rand(1,3e7)*2*pi;
timeit(@() pol2cart(th,r))
ans =
           0.5896013544015

So roughly 0.6 seconds for 30 million such operations. I could just divide by 10000 to infer the time for 3000 points. Or just use timeit again.

r = rand(1,3000);
th = rand(1,3000)*2*pi;
timeit(@() pol2cart(th,r))
ans =
    7.20480022142857e-05

Really short times like this are difficult to estimate exactly, which is why I did it for 30 million sets of r and theta the first time. But the point is, are there places in your code where you are wasting 0.00007 seconds of CPU time? I'll guarantee that there are dozens of places where your code is wasting 0.00007 seconds, many thousands of times. But even there, we might still be talking about a few seconds of CPU time wasted in total. And when we are talking about code written by a relatively new programmer in MATLAB, this is certainly small potatoes.

And that is my point. Don't spend time trying to pre-optimize your code. Don't waste your time trying to save such a small amount of time. This is a classic mistake that people make, and you are falling into that trap.

Sign in to comment.

Integrating bivariate normal distribution in polar coordinates efficiently

2 Comments
Show NoneHide None

Answers (1)

4 Comments
Show 2 older commentsHide 2 older comments

See Also

Categories

Tags

Community Treasure Hunt

Integrating bivariate normal distribution in polar coordinates efficiently

2 Comments Show NoneHide None

Answers (1)

4 Comments Show 2 older commentsHide 2 older comments

See Also

Categories

Tags

Community Treasure Hunt

2 Comments
Show NoneHide None

4 Comments
Show 2 older commentsHide 2 older comments