What is the efficient code?

Question

MIN HOON on 12 Mar 2023

https://nl.mathworks.com/matlabcentral/answers/1927460-what-is-the-efficient-code

Edited: Andy Bartlett on 14 Mar 2023

I got the warning message below in Model Advisor.

warning:

The following blocks will invoke net slope computation for multiplication.
The net slope computation can be implemented by: multiplication-and-shift or integer multiplication and/or division.
Changing the Configuration Parameters > Optimization > Use division for fixed-point net slope computation setting to On might generate more efficient code.
For example, net slope computation from fixdt(1, 16, 7/10, 0) to fixdt(1, 16, 1, 0) can be achieved by Qy = (Qu*7)/10 instead of Qu*11469 >> 14.
To use this option, change the Integer rounding mode parameter of the following blocks to Simplest or to the Configuration Parameters > Hardware Implementation > Production Hardware > Signed integer division rounds to setting.

In this warning message, what does "more efficient code" mean?

Readability or processing speed?

Answer 1

Walter Roberson on 12 Mar 2023

If you are using fixdt() then you are likely targeting hardware.

If that hardware does not have a division operator, or the division is slow, or the division does not support the data types you need, then use multiplication-and-shift.

If the hardware has a division operator that supports the data types you need, then it would probably be faster to use it.

Note that if you are targeting an FPGA, leaving out all division can save a notable number of gates. Leaving out all floating-point operations of any kind can save a lot of gates on an FPGA. But if you need to compute with a range of values such that fixed-point becomes awkward, then it might be worth linking in a floating-point core.

Answer 2

Andy Bartlett on 14 Mar 2023

Edited: Andy Bartlett on 14 Mar 2023

Simple Answer: Use the shift approach unless the multiplicative constant is really big

In my experience, multiplying by a constant and then shifting is more efficient than division in most cases.

The exceptions occur when the multiplicative constant is quite big and forces a more cumbersome multiplication to be performed, especially a multiword multiply.

Improvements to the handling of slope-bias casts in recent releases will make the occurrence of big multiplicative constants less likely. Also, slope-bias scaling is most frequently used with types of 16 or fewer bits. Small word lengths make it more likely that the multiplicative constant can be implemented quite efficiently.
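As a rough illustration of when the multiplication becomes multiword, here is a sketch in C. It assumes a net slope of 7/10 applied to a 32-bit stored integer using a 31-bit scaled constant; the function name and the constant are hypothetical choices for illustration, not what Simulink generates. By comparison, the 16-bit example from the warning (Qu*11469 >> 14) fits in a single 32-bit product.

```c
#include <stdint.h>

/* Hypothetical example: scale a 32-bit stored integer by approximately 7/10.
 * 1503238554 is round(0.7 * 2^31), so the product no longer fits in 32 bits.
 * On targets without a widening multiply instruction, the compiler has to
 * build this 64-bit product from several narrower operations
 * (a multiword multiply).
 * Note: right-shifting a negative value is implementation-defined in C;
 * mainstream compilers such as gcc use an arithmetic shift. */
int32_t net_slope_wide(int32_t qu)
{
    int64_t product = (int64_t)qu * INT64_C(1503238554);
    return (int32_t)(product >> 31);
}
```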

More Details

As Walter noted, if you have a target that does not have an integer division instruction, such as an ARM Cortex-M0, then you want to avoid division. With no hardware instruction, division will need to be implemented by sequencing many other operations.

But division by a constant value is a special case. Even when an integer division hardware operation is available, it often takes many more clock cycles than other operations. Many compilers have lots of tricks to avoid using a division operation when the denominator is a constant.
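One well-known trick of this kind replaces unsigned division by 10 with a multiplication by a scaled reciprocal followed by a shift. The sketch below writes that out by hand for 32-bit unsigned values; the function name is made up for illustration, and the exact instruction sequence a given compiler emits may differ.

```c
#include <stdint.h>

/* Divide an unsigned 32-bit value by 10 without a division instruction.
 * 0xCCCCCCCD is ceil(2^35 / 10); multiplying by it in 64 bits and shifting
 * right by 35 gives the same result as x / 10 for every 32-bit unsigned x. */
uint32_t div10_by_reciprocal(uint32_t x)
{
    uint64_t product = (uint64_t)x * UINT64_C(0xCCCCCCCD);
    return (uint32_t)(product >> 35);
}
```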

Example with Godbolt Compiler Explorer

Here is an example using the Godbolt online compiler explorer.

For gcc targeting Cortex-M4 (which does have an integer division operation) with these compiler options

-O2 -march=armv7e-m -mtune=cortex-m4

the division approach requires more code, but it doesn't actually use a division. The compiler exploited the fact that the denominator is a constant and optimized the division away. Instead of division, it used multiply and shifts. Sound familiar?
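The source code for that comparison is not reproduced here. As a stand-in, the two forms from the Model Advisor warning could be written as the following C functions and pasted into Compiler Explorer with the options above. The function names are hypothetical, and this is a hand-written sketch, not the code Simulink generates.

```c
#include <stdint.h>

/* Division form from the warning: Qy = (Qu*7)/10.
 * C integer division truncates toward zero. */
int16_t net_slope_div(int16_t qu)
{
    return (int16_t)(((int32_t)qu * 7) / 10);
}

/* Multiply-and-shift form from the warning: Qy = Qu*11469 >> 14,
 * where 11469 is round(0.7 * 2^14).
 * Right-shifting a negative value is implementation-defined in C;
 * gcc performs an arithmetic shift, which rounds toward negative infinity. */
int16_t net_slope_shift(int16_t qu)
{
    return (int16_t)(((int32_t)qu * 11469) >> 14);
}
```

The two forms are not bit-identical. For Qu = 32767 the division form gives 22936 and the shift form gives 22937, because 11469/2^14 is slightly larger than 0.7.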

With gcc targeting Cortex-M0 with these options

-O2 -mtune=cortex-m0

the compiler has "optimized away" both division by a constant and multiplication by a constant. The multiplication version is still a little smaller. I did not count the clock cycles, but I'm guessing the approach that originated as a multiply-then-shift is faster than the approach that originated as division-then-multiply.
