Polyphase filter bank and fast Fourier transform—optimized for HDL code generation
DSP System Toolbox HDL Support / Filtering
The Channelizer HDL Optimized block separates a broadband input signal into multiple narrowband output signals. It provides hardware speed and area optimization for streaming data applications. The block accepts scalar or vector input of real or complex data, provides hardware-friendly control signals, and has optional output frame control signals. You can achieve giga-sample-per-second (GSPS) throughput using vector input. The block implements a polyphase filter, with one subfilter per input vector element. The hardware implementation interleaves the subfilters, which results in sharing each filter multiplier (FFT Length / Input Size) times. The FFT implementation uses the same pipelined Radix 2^2 FFT algorithm as the FFT HDL Optimized block.
dataIn
— Input data
The vector size must be a power of 2 from 1 to 64 and must not be greater than the number of channels (FFT length).
double and single data types are supported for simulation, but not for HDL code generation.
The block does not accept uint64 data.
Data Types: fixed point | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | single | double
Complex Number Support: Yes
validIn
— Indicates valid input data
When validIn is true, the block captures the value on dataIn.
Data Types: Boolean
reset
— Reset control signal (optional)
When reset is true, the block stops the current calculation and clears internal state.
To enable this port, select Enable reset input port.
Data Types: Boolean
dataOut
— Frequency channel output data
If you set Output vector size to Same as number of frequency bands (default), the output data is a 1-by-M vector, where M is the FFT length.
If you set Output vector size to Same as input size, the output data is an M-by-1 vector, where M is the input vector size.
The output order is bit natural for either output size. The output data type is a result of the Filter output data type and the bit growth in the FFT necessary to avoid overflow.
validOut
— Indicates valid output data
The block sets validOut to true with each valid sample on dataOut.
Data Types: Boolean
startOut
— Indicates the first valid cycle of output data (optional)
The block sets startOut to true during the first valid sample on dataOut.
To enable this port, select Enable start output port.
Data Types: Boolean
endOut
— Indicates the last valid cycle of output data (optional)
The block sets endOut to true during the last valid sample on dataOut.
To enable this port, select Enable end output port.
Data Types: Boolean
Number of frequency bands (FFT length)
— FFT length
8 (default) | integer power of two
For HDL code generation, the FFT length must be a power of 2 from 2^3 to 2^16.
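The size constraints on the FFT length and the input vector can be collected into one check. The following is a minimal Python sketch, not part of the toolbox: the function names `is_power_of_two` and `check_channelizer_sizes` are hypothetical, and the rules are taken only from the dataIn and FFT length descriptions on this page.

```python
def is_power_of_two(value):
    """True for 1, 2, 4, 8, ... (positive integers with a single set bit)."""
    return value >= 1 and (value & (value - 1)) == 0


def check_channelizer_sizes(fft_length, vector_size):
    """Return a list of violated constraints (empty when the sizes are acceptable).

    Hypothetical helper; the limits mirror the text of this page only.
    """
    problems = []
    if not (is_power_of_two(fft_length) and 2**3 <= fft_length <= 2**16):
        problems.append("FFT length must be a power of 2 from 2^3 to 2^16")
    if not (is_power_of_two(vector_size) and 1 <= vector_size <= 64):
        problems.append("input vector size must be a power of 2 from 1 to 64")
    if vector_size > fft_length:
        problems.append("input vector size must not exceed the FFT length")
    return problems
```

For example, `check_channelizer_sizes(512, 16)` returns an empty list, while an FFT length of 4 with a vector size of 8 reports two violations.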
Filter coefficients
— Polyphase filter coefficients
[ -0.032, 0.121, 0.318, 0.482, 0.546, 0.482, 0.318, 0.121, -0.032 ] (default) | vector of numeric values
If the number of coefficients is not a multiple of Number of frequency bands (FFT length), the block pads this vector with zeros. The default filter specification is a square-root raised-cosine FIR filter, rcosdesign(0.25,2,4,'sqrt'). You can specify a vector of coefficients or a call to a filter design function that returns the coefficient values. Complex coefficients are not supported. By default, the block casts the coefficients to the same data type as the input.
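The zero-padding rule can be illustrated with the default coefficients. This Python sketch is an illustration only: `pad_to_multiple` is a hypothetical helper, and appending the zeros at the end of the vector is an assumption, since this page does not say where the block places them.

```python
def pad_to_multiple(coefficients, fft_length):
    """Zero-pad so the coefficient count is a multiple of fft_length.

    Assumption: zeros are appended after the last coefficient.
    """
    remainder = len(coefficients) % fft_length
    padding = (fft_length - remainder) % fft_length
    return list(coefficients) + [0.0] * padding


# Default coefficients listed for the Filter coefficients parameter (9 values).
default_coefficients = [-0.032, 0.121, 0.318, 0.482, 0.546,
                        0.482, 0.318, 0.121, -0.032]

padded = pad_to_multiple(default_coefficients, 8)  # default FFT length is 8
taps_per_subfilter = len(padded) // 8              # NumCoeffs / N
```

With the default FFT length of 8, the nine default coefficients are padded to 16 values, which gives two taps per subfilter.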
Complex multiplication
— HDL implementation of complex multipliers
Use 4 multipliers and 2 adders (default) | Use 3 multipliers and 5 adders
HDL implementation of complex multipliers, specified as either 'Use 4 multipliers and 2 adders' or 'Use 3 multipliers and 5 adders'. Depending on your synthesis tool and target device, one option may be faster or smaller.
This option applies only if you use the Radix 2^2 architecture.
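The two options trade one multiplier for three extra adders. The Python sketch below shows one well-known way to arrange each form for the product (a + jb)(c + jd). The function names are hypothetical, and the exact arrangement used in the generated HDL is not documented here, so treat the three-multiplier form as an assumed textbook variant.

```python
def multiply_4m2a(a, b, c, d):
    """(a + jb)(c + jd) using 4 multiplications and 2 additions."""
    real = a * c - b * d
    imag = a * d + b * c
    return real, imag


def multiply_3m5a(a, b, c, d):
    """Same product using 3 multiplications and 5 additions."""
    k1 = c * (a + b)   # addition 1, multiplication 1
    k2 = a * (d - c)   # addition 2, multiplication 2
    k3 = b * (c + d)   # addition 3, multiplication 3
    real = k1 - k3     # addition 4: ac - bd
    imag = k1 + k2     # addition 5: ad + bc
    return real, imag
```

Both functions return the same pair for integer inputs, so the choice affects only resource use and timing, not the arithmetic result.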
Output vector size
— Size of output data
Same as number of frequency bands (default) | Same as input size
The output data is a vector of frequency channel values whose size depends on this setting. The output order is bit natural for either output size.
Same as number of frequency bands — Output data is a 1-by-M vector, where M is the FFT length.
Same as input size — Output data is an M-by-1 vector, where M is the input vector size.
Divide butterfly outputs by two
— FFT scaling
When you select this parameter, the FFT implements an overall 1/N scale factor by dividing the output of each butterfly stage by 2. This adjustment keeps the output of the FFT in the same amplitude range as its input. If scaling is disabled, the FFT avoids overflow by increasing the word length by 1 bit at each stage.
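The effect of this parameter on word length can be worked out with simple arithmetic. The Python sketch below assumes that an N-point FFT has log2(N) butterfly stages, each of which either divides by 2 or grows the word by 1 bit. The function names are hypothetical and the stage count is an assumption, not a statement about the generated hardware.

```python
from fractions import Fraction


def butterfly_stages(fft_length):
    """log2 of a power-of-two FFT length (assumed number of butterfly stages)."""
    return fft_length.bit_length() - 1


def fft_output_word_length(filter_output_word_length, fft_length, divide_by_two):
    """Word length at the FFT output for either scaling choice."""
    if divide_by_two:
        return filter_output_word_length          # scaling keeps the word length
    return filter_output_word_length + butterfly_stages(fft_length)


def overall_scale(fft_length, divide_by_two):
    """Overall gain factor applied by the per-stage divide-by-two."""
    if not divide_by_two:
        return Fraction(1)
    return Fraction(1, 2) ** butterfly_stages(fft_length)
```

Under these assumptions, a 16-bit filter output feeding an 8-point FFT stays at 16 bits with scaling enabled and grows to 19 bits with scaling disabled.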
Rounding mode
— Rounding method used for internal fixed-point calculations
Floor (default) | Ceiling | Convergent | Nearest | Round | Zero
See Rounding Modes. The block uses fixed-point arithmetic for internal calculations when the input is any integer or fixed-point data type. This option does not apply when the input is single or double.
Each FFT stage rounds after the twiddle factor multiplication but before
the butterflies. Rounding can also occur when casting the coefficients
and the output of the polyphase filter to the data types you specify.
Saturate on integer overflow
— Overflow handling for internal fixed-point calculations
See Overflow Handling. The block uses fixed-point arithmetic for internal calculations when the input is any integer or fixed-point data type. This option does not apply when the input is single or double.
This option applies to casting the coefficients and the output of the polyphase filter to the data types you specify.
The FFT algorithm avoids overflow by either scaling the output of each stage (Divide butterfly outputs by two enabled), or by increasing the word length by 1 bit at each stage (Divide butterfly outputs by two disabled).
Coefficient data type
— Data type of the filter coefficients
Inherit: Same word length as input (default) | data type expression
The block casts the polyphase filter coefficients to this data type, using the rounding and overflow settings you specify. When you select Inherit: Same word length as input (default), the block selects the binary point using fi() best-precision rules.
Filter output data type
— Data type of the output of the polyphase filter
Inherit: Same word length as input (default) | Inherit: via internal rule | data type expression
The block casts the output of the polyphase filter (the input to the FFT) to this data type, using the rounding and overflow settings you specify. When you select Inherit: Same word length as input (default), the block selects a best-precision binary point by considering the values of your filter coefficients and the range of your input data type.
By default, the FFT logic does not modify the data type. When you disable Divide butterfly outputs by two, the FFT increases the word length by 1 bit at each stage to avoid overflow.
Enable reset input port
— Optional reset signal
When you select this parameter, the reset port shows on the block icon. When the reset input is true, the block stops calculation and clears all internal state.
Enable start output port
— Optional control signal indicating start of data
When you select this parameter, the startOut port shows on the block icon. The startOut signal is true for the first cycle of output data in a frame.
Enable end output port
— Optional control signal indicating end of data
When you select this parameter, the endOut port shows on the block icon. The endOut signal is true for the last cycle of output data in a frame.
The polyphase filter algorithm requires a subfilter for each FFT channel. For more detail on the polyphase filter architecture, refer to [1], and to the Channelizer block reference page.
Note
The output of the Channelizer HDL Optimized block does not match the output from the Channelizer block sample-for-sample. This mismatch is because the blocks apply the input samples to the subfilters in different orders. The Channelizer HDL Optimized block applies input X(0) to subfilter E_{M-1}(z), X(1) to subfilter E_{M-2}(z), ..., X(M-1) to subfilter E_0(z). The channels detected by both blocks match when analyzed over multiple frames.
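The general idea of a polyphase channelizer, one subfilter per channel followed by a transform across the subfilter outputs, can be checked numerically. The Python sketch below compares a channelizer computed directly from its definition (mix down, filter, decimate) with the polyphase form. It is a textbook formulation under assumed sign and branch-ordering conventions, not a model of either block, and the function names are hypothetical.

```python
import cmath


def channelize_direct(x, h, num_channels, m):
    """Channel k at decimated time m: mix by exp(-j*2*pi*k*n/M), filter with h, decimate by M."""
    out = []
    for k in range(num_channels):
        acc = 0j
        for r, coeff in enumerate(h):
            n = m * num_channels - r           # input sample index
            if 0 <= n < len(x):
                acc += coeff * x[n] * cmath.exp(-2j * cmath.pi * k * n / num_channels)
        out.append(acc)
    return out


def channelize_polyphase(x, h, num_channels, m):
    """Same outputs from M subfilters followed by a length-M transform."""
    branch = []
    for q in range(num_channels):
        acc = 0j
        # Subfilter q uses every M-th coefficient: h[q], h[q + M], ...
        for p, coeff in enumerate(h[q::num_channels]):
            n = (m - p) * num_channels - q
            if 0 <= n < len(x):
                acc += coeff * x[n]
        branch.append(acc)
    # Combine the branch outputs with a DFT-style kernel across q.
    return [
        sum(branch[q] * cmath.exp(2j * cmath.pi * k * q / num_channels)
            for q in range(num_channels))
        for k in range(num_channels)
    ]
```

The two functions agree to within floating-point error for any input, which is why the hardware needs only the subfilters and one transform per output frame rather than a separate mixer and filter for each channel.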
If the input vector size, M, is the same as the FFT length, N, then the block implements N subfilters in the hardware. Each subfilter is a direct-form transposed FIR filter with NumCoeffs/N taps.
If the vector size is less than N, the block implements one subfilter for each input vector element. The subfilter multipliers are shared as necessary to implement N channel filters. The shared multiplier taps have a lookup table for N/M filter coefficients. Each tap is followed by a delay line of N/M–1 cycles.
The output of the subfilters is cast to the specified Filter output data type, using the rounding and overflow settings you chose. Each filter tap in the subfilter is pipelined to target the DSP sections of an FPGA.
For instance, for an FFT length of 8, and an input vector size of 4, the block implements four filters. Each multiplier is shared N/M times, or twice. Each tap applies two coefficients, and the delay line is N/M–1 cycles.
For scalar input, the block implements one filter. Each multiplier is shared N times. Each tap applies N coefficients, and the delay line is N–1 cycles.
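The sharing arithmetic described above can be tabulated for any FFT length N and input vector size M. This Python sketch simply restates those relationships; `sharing_summary` and its dictionary keys are hypothetical names.

```python
def sharing_summary(fft_length, vector_size):
    """Restate the N/M sharing relationships for a given configuration."""
    share = fft_length // vector_size          # N / M
    return {
        "subfilters_in_hardware": vector_size, # one per input vector element
        "multiplier_share_factor": share,      # each multiplier is used N/M times
        "coefficients_per_tap": share,         # lookup table depth per tap
        "delay_line_cycles": share - 1,        # N/M - 1 cycles after each tap
    }
```

For an FFT length of 8, a vector size of 4 gives four subfilters with each multiplier shared twice, and scalar input gives one subfilter with each multiplier shared eight times.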
The latency varies with FFT length and vector size. After you update the model, the latency is displayed on the block icon. The displayed latency is the number of cycles between the first valid input and the first valid output, assuming that the input is contiguous. The filter coefficients do not affect the latency. Setting the output size equal to the input size reduces the latency, because the samples are not saved and reordered.
This diagram shows validIn and validOut signals for contiguous input data with a vector size of 16 and an FFT length of 512.
The diagram also shows the optional startOut and endOut signals that indicate frame boundaries. When enabled, startOut pulses for one cycle with the first validOut of the frame, and endOut pulses for one cycle with the last validOut of the frame.
If you apply continuous input frames (no gap in validIn between frames), the output is also continuous after the initial latency.
The validIn signal can be noncontiguous. Data accompanied by a validIn signal is stored until a frame is filled. Then the data is output in a contiguous frame of N (FFT length) cycles. This diagram shows noncontiguous input and contiguous output for an FFT length of 512 and a vector size of 16 samples.
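The buffering behavior for noncontiguous input can be pictured with a small behavioral model. The Python sketch below only groups valid samples into complete frames. It ignores latency, vector input, and the filter and FFT arithmetic, and `collect_frames` is a hypothetical name.

```python
def collect_frames(samples, valid_flags, samples_per_frame):
    """Group the samples whose valid flag is set into complete frames.

    Samples with a false valid flag are skipped. A partially filled
    frame at the end of the stream is held back, not returned.
    """
    frames = []
    current = []
    for sample, valid in zip(samples, valid_flags):
        if not valid:
            continue
        current.append(sample)
        if len(current) == samples_per_frame:
            frames.append(current)   # frame is full: release it as one unit
            current = []
    return frames
```

Gaps in the valid flags delay when a frame fills, but each released frame is complete, which corresponds to the contiguous output frames described above.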
These resource and performance data are the place-and-route results from the generated HDL targeted to a Xilinx® Virtex® 6 (XC6VLX240-1ff784) FPGA. The three examples in the tables use this configuration:
FFT length (default) — 8
Filter length — 96 coefficients
16-bit complex input data
Coefficient and filter output data types (default) — Inherit: Same word length as input
Complex multiplication (default) — 4 multipliers, 2 adders
Output scaling — Enabled
Minimize clock enables (HDL Coder™ parameter)
Performance of the synthesized HDL code varies with your target and synthesis options.
For scalar input, the design achieves a clock frequency of 346 MHz. The latency is 53 cycles. The subfilters share each multiplier eight (N) times. The design uses these resources.
Resource | Number Used |
---|---|
LUT | 1591 |
FFS | 2681 |
Xilinx LogiCORE® DSP48 | 16 |
For four-sample vector input, the design achieves a clock frequency of 333 MHz. The latency is 31 cycles. The subfilters share each multiplier twice (N/M). The design uses these resources.
Resource | Number Used |
---|---|
LUT | 1912 |
FFS | 3986 |
Xilinx LogiCORE DSP48 | 56 |
For eight-sample vector input, the design achieves a clock frequency of 292 MHz. The latency is 20 cycles. When the input size is the same as the FFT length, the subfilters do not share any multipliers. The design uses these resources.
Resource | Number Used |
---|---|
LUT | 1388 |
FFS | 2302 |
Xilinx LogiCORE DSP48 | 110 |
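The three results can be compared by sample throughput as well as by clock frequency. The Python sketch below assumes the design accepts one input vector on every clock cycle (contiguous validIn), so throughput is the clock frequency multiplied by the vector size. `throughput_msps` is a hypothetical helper, and the figures are the ones reported above.

```python
def throughput_msps(clock_mhz, vector_size):
    """Input sample rate in MSPS, assuming one vector per clock cycle."""
    return clock_mhz * vector_size


# (vector size, reported clock frequency in MHz) from the three examples.
reported = [(1, 346), (4, 333), (8, 292)]

for vector_size, clock_mhz in reported:
    rate = throughput_msps(clock_mhz, vector_size)
    print(f"vector size {vector_size}: {rate} MSPS")
```

Under that assumption, the eight-sample configuration exceeds 2 GSPS even though its clock frequency is the lowest of the three, at the cost of the largest DSP48 count.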
[1] Harris, F. J., C. Dick, and M. Rice. “Digital Receivers and Transmitters Using Polyphase Filter Banks for Wireless Communications.” IEEE Transactions on Microwave Theory and Techniques. Vol. 51, No. 4, April 2003.
This block supports C/C++ code generation for Simulink® accelerator and rapid accelerator modes and for DPI component generation.
This block has a single, default HDL architecture.
Property | Description |
---|---|
ConstrainedOutputPipeline | Number of registers to place at the outputs by moving existing delays within your design. Distributed pipelining does not redistribute these registers. |
InputPipeline | Number of input pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. |
OutputPipeline | Number of output pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. |