HDL Optimization Properties

Optimize speed or area of generated HDL code

With the HDL optimization properties, you can specify speed vs. area tradeoffs in the generated code.

Specify these properties as name-value arguments to the generatehdl function. Name is the property name and Value is the corresponding value. You can specify several name-value arguments in any order as 'Name1',Value1,...,'NameN',ValueN.

For example:

fir = dsp.FIRFilter('Structure','Direct form antisymmetric');
generatehdl(fir,'InputDataType',numerictype(1,16,15),'AddPipelineRegisters','on');

Speed Optimization

Optimize clock rate with pipeline registers, specified as 'off' or 'on'. You cannot use this property with fully serial or cascade serial filters. When you set this property to 'on', the coder adds pipeline registers between filter computation stages. Although the registers add to the overall filter latency, they provide significant improvements to the clock rate.

Filter Type | Location of Added Pipeline Register
FIR transposed | Between coefficient multipliers and adders
Direct form FIR, antisymmetric FIR, and symmetric FIR | Between levels of a tree-based final adder. For an alternative tree-based summation technique, see also the property FIRAdderStyle.
IIR | Between sections
CIC | Between comb sections

For more details, see Optimizing the Clock Rate with Pipeline Registers.

Optimize clock rate with summation technique, specified as 'linear', 'tree', or 'pipelined'. This property applies only to direct form FIR, antisymmetric FIR, and symmetric FIR filters. You cannot use this property with fully serial or cascade serial filters. When you set this property to 'tree', the coder creates a final adder that performs pairwise addition on successive products that execute in parallel, rather than sequentially. When you set this property to 'pipelined', the coder creates a tree-based final adder with pipeline registers between the levels of the tree.

For more details, see Optimizing Final Summation for FIR Filters.

Dependencies

This property applies only when the AddPipelineRegisters property is set to 'off'.
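
As a rough illustration of the summation setting described above, the following sketch requests a tree-based final adder for a symmetric FIR filter. It follows the call pattern of the generatehdl example at the top of this page. The default filter coefficients and the 16-bit input type are placeholder assumptions, and the call needs the same products as that example.

```matlab
% Sketch only: symmetric FIR filter with default (placeholder) coefficients.
fir = dsp.FIRFilter('Structure','Direct form symmetric');

% AddPipelineRegisters is left at 'off', so the dependency above is met.
% 'tree' asks for pairwise addition of the products in the final adder.
generatehdl(fir,'InputDataType',numerictype(1,16,15), ...
    'FIRAdderStyle','tree');
```

Replacing 'tree' with 'linear' in the same call gives the other non-pipelined option for comparison.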

Extra input register, specified as 'on' or 'off'. When this property is set to 'on', the coder generates a signal named input_register and includes a process statement that controls the register. If the incurred latency is a concern, or if the filter is incorporated into code that has an existing input register, set this property to 'off'. For more details, see Specifying or Suppressing Registered Input and Output.

Extra output register, specified as 'on' or 'off'. When this property is set to 'on', the coder generates a signal named output_register and includes a process statement that controls the register. If the incurred latency is a concern, or if the filter is incorporated into code that has an existing output register, set this property to 'off'. For more details, see Specifying or Suppressing Registered Input and Output.

Number of pipeline stages on multiplier inputs, specified as a nonnegative integer. This property applies only to FIR filters. Multiplier pipelining can significantly increase clock rates. For more details, see Multiplier Input and Output Pipelining for FIR Filters.

Dependencies

To enable this property, set CoeffMultipliers to 'multiplier'.

Number of pipeline stages on multiplier outputs, specified as a nonnegative integer. This property applies only to FIR filters. Multiplier pipelining can significantly increase clock rates. For more details, see Multiplier Input and Output Pipelining for FIR Filters.

Dependencies

To enable this property, set CoeffMultipliers to 'multiplier'.
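
The two multiplier pipelining settings above might be combined as in the sketch below. The property names MultiplierInputPipeline and MultiplierOutputPipeline are not spelled out in the text above, so treat them as assumptions to check against the generatehdl reference for your release. The filter and the stage counts are placeholders.

```matlab
% Sketch only: direct form FIR filter with default (placeholder) coefficients.
fir = dsp.FIRFilter('Structure','Direct form');

% Keep real multipliers so that the pipelining properties are enabled,
% then request 1 stage on multiplier inputs and 2 stages on outputs.
% The two pipeline property names are assumed, not taken from this page.
generatehdl(fir,'InputDataType',numerictype(1,16,15), ...
    'CoeffMultipliers','multiplier', ...
    'MultiplierInputPipeline',1, ...
    'MultiplierOutputPipeline',2);
```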

Area Optimization

HDL code optimization, specified as 'off' or 'on'. By default, the coder generates the literal implementation of the filter with numeric behavior that matches the filter object exactly. This implementation is not necessarily an optimal HDL implementation. When this property is set to 'on', the coder reduces the area of the hardware implementation and optimizes data types and quantization effects. For more details about the underlying tradeoffs, see Optimize for HDL.

Implementation of coefficient multiplications, specified as 'multiplier', 'csd', or 'factored-csd'. You cannot use this property with multirate or serial filters.

  • 'multiplier' — The coder retains multiplier logic in the generated HDL code.

  • 'csd' — The coder implements multiplication using canonical signed digit (CSD) logic. The CSD technique replaces multipliers with shift and add logic. This technique also minimizes the number of adders used for constant multiplication by representing binary numbers with a minimum count of nonzero digits. This optimization decreases the area used by the filter while maintaining or increasing clock speed.

  • 'factored-csd' — The coder implements multiplication using factored CSD logic. Factored CSD replaces multiplier operations with shift and add operations on prime factors of the coefficients. This option achieves a greater area reduction than CSD, at the cost of decreasing clock speed.

For more details, see CSD Optimizations for Coefficient Multipliers.
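
To make the "minimum count of nonzero digits" idea concrete, here is a small sketch that converts an integer to signed digits (-1, 0, +1) with no two adjacent nonzero digits. It is a textbook-style conversion written for illustration, not the coder's internal algorithm, and the coefficient value 7 is an arbitrary assumption.

```matlab
% Convert a positive integer to CSD digits, least significant digit first.
value = 7;              % example coefficient: binary 111 has three ones
n = value;
csdDigits = [];
while n ~= 0
    if mod(n, 2) == 1
        d = 2 - mod(n, 4);   % +1 when n mod 4 is 1, -1 when n mod 4 is 3
        n = n - d;           % remove the digit so that n becomes even
    else
        d = 0;
    end
    csdDigits(end+1) = d;    %#ok<SAGROW>
    n = n / 2;               % move to the next bit position
end

binaryOnes    = sum(dec2bin(value) == '1');   % nonzero digits in plain binary
csdNonzeros   = nnz(csdDigits);               % nonzero digits in CSD form
reconstructed = sum(csdDigits .* 2.^(0:numel(csdDigits)-1));
fprintf('binary ones: %d, CSD nonzeros: %d, value check: %d\n', ...
    binaryOnes, csdNonzeros, reconstructed);
```

For 7, the signed-digit form is 8 - 1, so a multiplication by 7 can be built from one shift and one subtraction instead of two additions.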

Partitions for serial filter architectures, specified as one of the following:

  • -1 — The coder generates a fully parallel architecture. This architecture is equivalent to a serial partition defined as a vector of ones of the size of the effective filter length.

  • Effective filter length — The coder generates a fully serial architecture.

  • [p1 p2 ... pN] — The coder generates a partly serial architecture with N partitions. The integers in the vector specify the length of each partition. The sum of the vector elements must be equal to the effective filter length. To reduce the area further, you can generate a cascade-serial architecture by enabling the ReuseAccum property. For some examples, see Generate Serial Partitions for FIR Filter.

  • Cell array of serial partitions — The coder generates partitions for each filter stage in a cascaded filter. Specify the partitions for each filter stage as -1, the effective filter length, or a vector of integers. The elements of each vector must sum to the effective filter length of the associated filter in the cascade. For an example, see Generate Serial Partitions of Cascaded Filter.

    When the serial partition of a filter stage is set to -1, you can specify a LUT partition for that stage by using the DALUTPartition and DARadix properties. For more details, see Architecture Options for Cascaded Filters.

You cannot use this property with IIR SOS filters. To generate serial architectures for IIR SOS filters, use the FoldingFactor or NumMultipliers properties instead.

Use this table as a guide for calculating the effective filter length. Alternatively, you can use the hdlfilterserialinfo function to display the effective filter length and possible partitions for a filter.

Filter Type | Effective Filter Length Calculation
Direct form | FL = length(find(filt.Numerator ~= 0))
Direct form symmetric | FL = ceil(length(find(filt.Numerator ~= 0))/2)
Direct form antisymmetric | FL = ceil(length(find(filt.Numerator ~= 0))/2)

For more details, see Specifying Speed vs. Area Tradeoffs via generatehdl Properties.

For an overview of parallel and serial architectures and a list of filter types supported for each architecture, see Speed vs. Area Tradeoffs.
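
The effective filter length formulas and the sum rule for partition vectors can be checked by hand before generating code. The sketch below uses a made-up 9-tap symmetric numerator and a candidate partition; both are assumptions for illustration only, and hdlfilterserialinfo remains the authoritative way to list the partitions a given filter supports.

```matlab
% Hypothetical symmetric numerator with 9 nonzero taps.
b = [1 2 3 4 5 4 3 2 1] / 25;

% Effective filter lengths, following the table above.
flDirect    = length(find(b ~= 0));            % direct form
flSymmetric = ceil(length(find(b ~= 0)) / 2);  % symmetric or antisymmetric

% Candidate partly serial partition for the symmetric structure.
partition = [3 2];

% The vector elements must sum to the effective filter length.
isValidPartition = all(partition > 0) && sum(partition) == flSymmetric;

% A vector of ones of length FL corresponds to the fully parallel case.
parallelEquivalent = ones(1, flSymmetric);

fprintf('FL (direct) = %d, FL (symmetric) = %d, partition valid: %d\n', ...
    flDirect, flSymmetric, isValidPartition);
```

A partition that passes this check could then be supplied as the SerialPartition value in a generatehdl call for a filter with the matching structure.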

Accumulator reuse for cascade-serial architecture, specified as 'off' or 'on'. When this property is set to 'on', the coder groups filter taps into several serial partitions. The accumulated output of each partition is cascaded to the accumulator of the previous partition. The output of the partitions is therefore computed at the accumulator of the first partition. This technique, called accumulator reuse, saves chip area. If the property SerialPartition is not defined, the coder generates an optimal partition. For more details, see Specifying Speed vs. Area Tradeoffs via generatehdl Properties.

For an overview of parallel and serial architectures and a list of filter types supported for each architecture, see Speed vs. Area Tradeoffs.

Lookup table (LUT) partitions for distributed arithmetic (DA), specified as one of the following:

  • -1 — The coder generates a fully parallel architecture.

  • Effective filter length — The coder generates a DA implementation without LUT partitioning.

  • [p1 p2 ... pN] — The coder generates a DA implementation with N LUT partitions. The integers in the vector specify the size of each partition. The maximum size for an individual partition is 12. The sum of the vector elements must be equal to the effective filter length. For multirate filters, each polyphase subfilter uses the same LUT partitions. For an example, see Distributed Arithmetic for Single Rate Filters.

  • [p1 p2 ... pN; q1 q2 ... qN; ...] — The coder generates a DA implementation with N unique LUT partitions for each polyphase subfilter of a multirate filter. Each row of the matrix specifies the partitions for one subfilter. The elements in each row must sum to the associated subfilter length, FLi. For an example, see Distributed Arithmetic for Multirate Filters.

  • Cell array of DALUT partitions — The coder generates DA implementation with different LUT partitions for each filter stage of the cascade. Specify the LUT partitions for each filter stage as -1, the effective filter length, or a vector of integers. The elements of each vector must sum to the effective filter length of the associated filter in the cascade. For an example, see Distributed Arithmetic for Cascaded Filters.

    When the LUT partition of a filter stage is set to -1, you can specify a serial partition for that stage by using the SerialPartition property. For more details, see Architecture Options for Cascaded Filters.

Use this table as a guide for calculating the effective filter length. Alternatively, you can use the hdlfilterdainfo function to display the effective filter length, LUT partitioning options, and possible DARadix values for the filter.

Filter Type | Effective Filter Length Calculation
Direct form | FL = length(find(filt.Numerator ~= 0))
Direct form symmetric | FL = ceil(length(find(filt.Numerator ~= 0))/2)
Direct form antisymmetric | FL = ceil(length(find(filt.Numerator ~= 0))/2)
Multirate with uniform LUT partitions for each polyphase subfilter | FL = size(polyphase(filt),2)
Multirate with unique LUT partitions for each polyphase subfilter | p = polyphase(filt); FLi = length(find(p(i,:))), where i is the index of the ith row of the polyphase matrix of the filter. The ith row of the matrix p represents the ith subfilter.

For more details, see Distributed Arithmetic for FIR Filters.
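
The per-subfilter rules above (each row sums to FLi, no partition larger than 12) can be sketched as a quick check. The matrix p below is a hand-written stand-in for the output of polyphase(filt), and the partition matrix is a made-up candidate; neither comes from a real filter.

```matlab
% Hypothetical polyphase matrix: one row per subfilter.
p = [0.1 0.3 0.5 0.2;
     0.2 0.4 0.6 0  ];

% Subfilter lengths FLi, following the table above.
numSubfilters = size(p, 1);
FLi = zeros(1, numSubfilters);
for i = 1:numSubfilters
    FLi(i) = length(find(p(i, :)));
end

% Candidate LUT partitions, one row per subfilter.
lutPartitions = [2 2;
                 2 1];

maxLutSize = 12;   % maximum size of an individual partition, as stated above
rowsSumToFLi = all(sum(lutPartitions, 2).' == FLi);
withinMaxSize = all(lutPartitions(:) <= maxLutSize);
fprintf('FLi = [%s], rows valid: %d, sizes valid: %d\n', ...
    num2str(FLi), rowsSumToFLi, withinMaxSize);
```

For a real filter, hdlfilterdainfo reports the effective filter length and the LUT partitioning options directly, so a manual check like this is mainly useful for understanding the rule.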

Number of bits processed simultaneously in distributed arithmetic (DA), specified as 2, 2^N, or {2^N,2^M,...} where:

  • N > 0

  • mod(W,N) = 0, where W is the input word size of the filter

  • 2^N <= 2^W

This property specifies a degree of parallelism in the DA architecture which can improve clock speed at the expense of area.

  • 2^1 — The coder implements a fully serial DA architecture that processes 1 bit at a time.

  • 2^N — The coder generates a partly serial DA architecture when 1 < N < W.

  • 2^W — The coder generates a fully parallel DA architecture.

  • {2^N,2^M,...} — The coder generates a DA implementation with different DARadix values for each filter stage in a cascaded filter. For an example, see Distributed Arithmetic for Cascaded Filters.

    When the DARadix value of a filter stage is set to 2, you can specify a serial architecture for that stage by using the SerialPartition property. For more details, see Architecture Options for Cascaded Filters.

For more details, see Distributed Arithmetic for FIR Filters.
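
The constraints on N can be turned into a short enumeration of candidate radix values. The sketch below assumes a 16-bit input word size and applies only the three conditions listed above; a particular filter may impose further limits, which hdlfilterdainfo reports.

```matlab
W = 16;                          % assumed input word size in bits

% Keep every N with N > 0, N <= W, and mod(W,N) = 0.
N = 1:W;
N = N(mod(W, N) == 0);

radixValues = 2.^N;              % candidate radix values 2^N

% First entry (N = 1) is fully serial, last entry (N = W) is fully parallel.
fprintf('N values: %s\n', mat2str(N));
fprintf('Radix values: %s\n', mat2str(radixValues));
```

Larger radix values process more bits per clock cycle, which matches the speed versus area tradeoff described above.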

Folding factor for IIR filter, specified as a positive integer. Use this property to define a serial architecture for direct form I or direct form II SOS filters. To reduce area in a serial architecture implementation, you can share multipliers at the cost of latency. The folding factor specifies the factor by which the clock rate increases in response to area optimization.

You can specify either the FoldingFactor property or the NumMultipliers property, but not both. If you do not specify either property, the coder generates a fully parallel architecture.

For information about the FoldingFactor options and the corresponding NumMultipliers, call the hdlfilterserialinfo function.

Number of shared multipliers for IIR filter, specified as a positive integer. Use this property to define a serial architecture for direct form I or direct form II SOS filters. Shared multipliers reduce area at the cost of an increased clock rate.

You can specify either the NumMultipliers property or the FoldingFactor property, but not both. If you do not specify either property, the coder generates a fully parallel architecture.

For information about the NumMultipliers options and the corresponding FoldingFactor, call the hdlfilterserialinfo function.

Tips

If you use the fdhdltool function to generate HDL code, you can set the corresponding properties in the Generate HDL dialog box.

Property | Location in Dialog Box
Add input register | Global Settings tab > Ports tab
Add output register | Global Settings tab > Ports tab
Additional optimization properties | Filter Architecture tab
