Algorithms to Determine Fixed-Point Types for Real Least-Squares Matrix Solve AX=B

This example shows the algorithms that the fixed.realQRMatrixSolveFixedpointTypes function uses to analytically determine fixed-point types for the solution of the real least-squares matrix equation $A X = B$ , where $A$ is an $m$ -by- $n$ matrix with $m \geq n$ , $B$ is $m$ -by- $p$ , and $X$ is $n$ -by- $p$ .

Overview

You can solve the fixed-point least-squares matrix equation $A X = B$ using QR decomposition. Using a sequence of orthogonal transformations, QR decomposition transforms matrix $A$ in-place to upper triangular $R$ , and transforms matrix $B$ in-place to $C = Q^{'} B$ , where $Q R = A$ is the economy-size QR decomposition. This reduces the equation to an upper-triangular system of equations $R X = C$ . To solve for $X$ , compute $X = R \backslash C$ through back-substitution.
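As a minimal floating-point sketch of these steps, the following code solves a small random least-squares problem with the built-in qr function and compares the result with backslash. The sizes and the random data are illustrative assumptions, and the sketch does not model fixed-point arithmetic or the in-place transformations.

```matlab
% Illustrative sizes (assumptions for this sketch only).
m = 8; n = 3; p = 2;
A = randn(m,n);          % random tall matrix, full rank with probability 1
B = randn(m,p);

[Q,R] = qr(A,0);         % economy-size QR: Q is m-by-n, R is n-by-n
C = Q'*B;                % C = Q'B is n-by-p
X = R\C;                 % solve the upper-triangular system R*X = C

% Compare with the least-squares solution computed by backslash.
solveError = norm(X - A\B,'fro')
```

Because the data is random and unseeded, the exact value of solveError varies from run to run, but it should be at the level of floating-point round-off.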

You can determine appropriate fixed-point types for the least-squares matrix equation $A X = B$ by selecting the fraction length based on the number of bits of precision defined by your requirements. The fixed.realQRMatrixSolveFixedpointTypes function analytically computes the following upper bounds on $R$ , $C = Q^{'} B$ , and $X$ to determine the number of integer bits required to avoid overflow [1,2,3].

The upper bound for the magnitude of the elements of $R$ is

$\max (| R (:) |) \leq \sqrt{m} \max (| A (:) |)$ .

The upper bound for the magnitude of the elements of $C = Q^{'} B$ is

$\max (| C (:) |) \leq \sqrt{m} \max (| B (:) |)$ .

The upper bound for the magnitude of the elements of $X = A \backslash B$ is

$\max (| X (:) |) \leq \frac{\sqrt{m} \max (| B (:) |)}{\min (svd (A))}$ .

Since computing $svd (A)$ is more computationally expensive than solving the system of equations, the fixed.realQRMatrixSolveFixedpointTypes function estimates a lower bound of $\min (svd (A))$ .

Fixed-point types for the solution of the matrix equation $A X = B$ are generally well-bounded if the number of rows, $m$ , of $A$ is much greater than the number of columns, $n$ (i.e. $m \gg n$ ), and $A$ is full rank. If $A$ is not inherently full rank, then it can be made so by adding random noise. Random noise naturally occurs in physical systems, such as thermal noise in radar or communications systems. If $m = n$ , then the dynamic range of the system can be unbounded. For example, in the scalar equation $x = a / b$ with $a, b \in [- 1, 1]$ , $x$ can be arbitrarily large if $b$ is close to $0$ .
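To illustrate the difference between $m = n$ and $m \gg n$ , the following sketch compares the smallest singular value of square and tall matrices over a number of random draws. The Gaussian entries, the sizes, and the trial count are assumptions chosen for illustration; they are not part of the algorithm described on this page.

```matlab
numTrials = 200;                    % illustrative number of random draws
n = 10;
minSvSquare = zeros(1,numTrials);
minSvTall   = zeros(1,numTrials);
for k = 1:numTrials
    sSquare = svd(randn(n,n));      % m = n
    sTall   = svd(randn(300,n));    % m >> n
    minSvSquare(k) = sSquare(end);  % svd returns values in descending order
    minSvTall(k)   = sTall(end);
end
smallestSquare = min(minSvSquare)
smallestTall   = min(minSvTall)
```

In runs of this sketch, the smallest singular value of the square matrices is typically much closer to zero than that of the tall matrices, which is why the bound on $X$ is much looser when $m = n$ .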

Proofs of the Bounds

Properties and Definitions of Vector and Matrix Norms

The proofs of the bounds use the following properties and definitions of matrix and vector norms, where $Q$ is an orthogonal matrix, and $v$ is a vector of length $m$ [7].

$\begin{array}{lcl} | | A v | |_{2} & \leq & | | A | |_{2} | | v | |_{2} \\ | | Q | |_{2} & = & 1 \\ | | v | |_{\infty} & = & \max (| v (:) |) \\ | | v | |_{\infty} & \leq & | | v | |_{2} \leq \sqrt{m} | | v | |_{\infty} \end{array}$

If $A$ is an $m$ -by- $n$ matrix and $Q R = A$ is the economy-size QR decomposition of $A$ , where $Q$ is $m$ -by- $n$ with orthonormal columns and $R$ is upper-triangular and $n$ -by- $n$ , then the singular values of $R$ are equal to the singular values of $A$ . If $A$ has full column rank, then $R$ is nonsingular and

$| | R^{- 1} | |_{2} = | | (R^{'})^{- 1} | |_{2} = \frac{1}{\min (svd (R))} = \frac{1}{\min (svd (A))}$
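These properties can be checked numerically. The following sketch uses a small random matrix (the sizes are illustrative assumptions) and verifies that the columns of $Q$ are orthonormal, that the singular values of $R$ match those of $A$ , and that the 2-norm of $R^{- 1}$ equals the reciprocal of the smallest singular value of $A$ .

```matlab
m = 12; n = 4;                                  % illustrative sizes
A = randn(m,n);                                 % full column rank with probability 1
[Q,R] = qr(A,0);                                % economy-size QR

orthogonalityError = norm(Q'*Q - eye(n))        % columns of Q are orthonormal
singularValueError = norm(svd(R) - svd(A))      % svd(R) equals svd(A)
normInvR = norm(inv(R));                        % 2-norm of inv(R); inv is used only for this check
inverseNormError = abs(normInvR - 1/min(svd(A)))
```

All three error values should be at the level of floating-point round-off.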

Upper Bound for R = Q'A

The upper bound for the magnitude of the elements of $R$ is

$\max (| R (:) |) \leq \sqrt{m} \max (| A (:) |)$ .

Proof of Upper Bound for R = Q'A

The $j$ th column of $R$ is equal to $R (:, j) = Q^{'} A (:, j)$ , so

$\begin{array}{rcl} \max (| R (:, j) |) & = & | | R (:, j) | |_{\infty} \\ & \leq & | | R (:, j) | |_{2} \\ & = & | | Q^{'} A (:, j) | |_{2} \\ & \leq & | | Q^{'} | |_{2} | | A (:, j) | |_{2} \\ & = & | | A (:, j) | |_{2} \\ & \leq & \sqrt{m} | | A (:, j) | |_{\infty} \\ & = & \sqrt{m} \max (| A (:, j) |) \\ & \leq & \sqrt{m} \max (| A (:) |) . \end{array}$

Since $\max (| R (:, j) |) \leq \sqrt{m} \max (| A (:) |)$ for all $1 \leq j \leq n$ , then

$\max (| R (:) |) \leq \sqrt{m} \max (| A (:) |) .$

Upper Bound for C = Q'B

The upper bound for the magnitude of the elements of $C = Q^{'} B$ is

$\max (| C (:) |) \leq \sqrt{m} \max (| B (:) |)$ .

Proof of Upper Bound for C = Q'B

The proof of the upper bound for $C = Q^{'} B$ is the same as the proof of the upper bound for $R = Q^{'} A$ by substituting $C$ for $R$ and $B$ for $A$ .
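As a quick numerical check of both bounds, the following sketch draws random matrices with elements in $[- 1, 1]$ and compares the largest elements of $R$ and $C$ with $\sqrt{m} \max (| A (:) |)$ and $\sqrt{m} \max (| B (:) |)$ . The uniform random data is an assumption made for illustration; the bounds themselves hold for any real $A$ and $B$ .

```matlab
m = 300; n = 10; p = 1;              % same sizes as the example on this page
A = 2*rand(m,n) - 1;                 % elements in [-1,1]
B = 2*rand(m,p) - 1;
[Q,R] = qr(A,0);                     % economy-size QR
C = Q'*B;

maxR   = max(abs(R(:)));
boundR = sqrt(m)*max(abs(A(:)));
maxC   = max(abs(C(:)));
boundC = sqrt(m)*max(abs(B(:)));
fprintf('max|R| = %.4f <= %.4f\n',maxR,boundR);
fprintf('max|C| = %.4f <= %.4f\n',maxC,boundC);
```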

Upper Bound for X = A\B

The upper bound for the magnitude of the elements of $X = A \backslash B$ is

$\max (| X (:) |) \leq \frac{\sqrt{m} \max (| B (:) |)}{\min (svd (A))}$ .

Proof of Upper Bound for X = A\B

If $A$ is not full rank, then $\min (svd (A)) = 0$ , and if $B$ is not equal to zero, then $\sqrt{m} \max (| B (:) |) / \min (svd (A)) = \infty$ and so the inequality is true.

Let $x = X (:, j)$ be the $j$ th column of $X$ , and $b = B (:, j)$ be the $j$ th column of $B$ . If $A$ is full rank, then $x = R^{- 1} (Q^{'} b)$ , so

$\begin{array}{rcl} \max (| x (:) |) & = & | | x | |_{\infty} \\ & \leq & | | x | |_{2} \\ & = & | | R^{- 1} \cdot (Q^{'} b) | |_{2} \\ & \leq & | | R^{- 1} | |_{2} | | Q^{'} | |_{2} | | b | |_{2} \\ & = & (1 / \min (svd (A))) \cdot 1 \cdot | | b | |_{2} \\ & = & | | b | |_{2} / \min (svd (A)) \\ & \leq & \sqrt{m} | | b | |_{\infty} / \min (svd (A)) \\ & = & \sqrt{m} \max (| b (:) |) / \min (svd (A)) . \end{array}$

Since $\max (| x (:) |) \leq \sqrt{m} \max (| b (:) |) / \min (svd (A))$ for every column $x$ of $X$ and the corresponding column $b$ of $B$ , and $\max (| b (:) |) \leq \max (| B (:) |)$ , then

$\max (| X (:) |) \leq \frac{\sqrt{m} \max (| B (:) |)}{\min (svd (A))}$ .
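The following sketch checks this bound on one random problem. The sizes and the random data are illustrative assumptions, and the smallest singular value is computed directly with svd here only because the problem is small.

```matlab
m = 50; n = 5; p = 3;                 % illustrative sizes
A = randn(m,n);                       % full rank with probability 1
B = 2*rand(m,p) - 1;                  % elements in [-1,1]

X = A\B;                              % least-squares solution
maxX   = max(abs(X(:)));
boundX = sqrt(m)*max(abs(B(:)))/min(svd(A));
fprintf('max|X| = %.4f <= %.4f\n',maxX,boundX);
```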

Lower Bound for min(svd(A))

You can estimate a lower bound $s$ of $\min (svd (A))$ for real-valued $A$ using the following formula,

$s = σ_{N} \sqrt{2 γ^{- 1} (\frac{p_{s} Γ (m - n + 1) Γ (n / 2)}{2^{m - n} Γ (\frac{m + 1}{2}) Γ (\frac{m - n + 1}{2})}, \frac{m - n + 1}{2})}$

where $σ_{N}$ is the standard deviation of random noise added to the elements of $A$ , $1 - p_{s}$ is the probability that $s \leq \min (svd (A))$ , $Γ$ is the gamma function, and $γ^{- 1}$ is the inverse incomplete gamma function gammaincinv.

The proof is found in [1][2]. It is derived by integrating the formula in Lemma 3.3 from [4] and rearranging terms.
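The formula can be evaluated with base MATLAB functions. The following is a sketch under stated assumptions, not the implementation of fixed.realSingularValueLowerBound: it assumes that $γ^{- 1}$ is the inverse of the regularized lower incomplete gamma function as computed by gammaincinv, and the variable names sigmaN, ps, logArg, and a are introduced here for illustration. The first argument is evaluated in the log domain with gammaln because $Γ (m - n + 1)$ overflows double precision for the sizes used in this example.

```matlab
m = 300; n = 10;                          % matrix size used in this example
sigmaN = sqrt(10^(-50/10));               % noise standard deviation (-50 dB)
ps = 0.5*erfc(5/sqrt(2));                 % equals (1 + erf(-5/sqrt(2)))/2

% Log of the first argument of the inverse incomplete gamma function.
logArg = log(ps) + gammaln(m-n+1) + gammaln(n/2) ...
    - (m-n)*log(2) - gammaln((m+1)/2) - gammaln((m-n+1)/2);
a = (m-n+1)/2;                            % second argument (shape parameter)

s = sigmaN*sqrt(2*gammaincinv(exp(logArg),a))
```

If these assumptions hold, the result should be close to the value of 0.0371 that fixed.realSingularValueLowerBound reports later in this example.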

Since $s \leq \min (svd (A))$ with probability $1 - p_{s}$ , then you can bound the magnitude of the elements of $X$ without computing $svd (A)$ ,

$\max (| X (:) |) \leq \frac{\sqrt{m} \max (| B (:) |)}{\min (svd (A))} \leq \frac{\sqrt{m} \max (| B (:) |)}{s}$ with probability $1 - p_{s}$ .

You can compute $s$ using the fixed.realSingularValueLowerBound function, which by default uses the probability corresponding to 5 standard deviations below the mean of a normal distribution, $p_{s} = (1 + erf (- 5 / \sqrt{2})) / 2 \approx 2.8665 \cdot 10^{- 7}$ , so the probability that the estimated bound $s$ does not exceed the actual smallest singular value of $A$ is $1 - p_{s} \approx 0.9999997$ .
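The default probability can be reproduced directly. In this short sketch, the variable names are introduced for illustration; the erfc form is algebraically identical to the erf form and avoids the cancellation in $1 + erf (\cdot)$ for values this close to $- 1$ .

```matlab
numStd = 5;                               % standard deviations below the mean
psErf  = (1 + erf(-numStd/sqrt(2)))/2;    % formula as written above
psErfc = erfc(numStd/sqrt(2))/2;          % same quantity, numerically safer
fprintf('ps = %.4e, 1 - ps = %.7f\n',psErfc,1 - psErfc);
```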

Example

This example runs a simulation with many random matrices and compares the analytical bounds with the actual singular values of $A$ and the actual largest elements of $R = Q^{'} A$ , $C = Q^{'} B$ , and $X = A \backslash B$ .

Define System Parameters

Define the matrix attributes and system parameters for this example.

m is the number of rows in matrices A and B. In a problem such as beamforming or direction finding, m corresponds to the number of samples that are integrated over.

m = 300;

n is the number of columns in matrix A and rows in matrix X. In a least-squares problem, m is greater than n, and usually m is much larger than n. In a problem such as beamforming or direction finding, n corresponds to the number of sensors.

n = 10;

p is the number of columns in matrices B and X. It corresponds to simultaneously solving a system with p right-hand sides.

p = 1;

In this example, set the rank of matrix A to be less than the number of columns. In a problem such as beamforming or direction finding, $rank (A)$ corresponds to the number of signals impinging on the sensor array.

rankA = 3;

precisionBits defines the number of bits of precision required for the matrix solve. Set this value according to system requirements.

precisionBits = 24;

In this example, real-valued matrices A and B are constructed such that the magnitude of their elements is less than or equal to one. Your own system requirements will define what those values are. If you don't know what they are, and A and B are fixed-point inputs to the system, then you can use the upperbound function to determine the upper bounds of the fixed-point types of A and B.

max_abs_A is an upper bound on the maximum magnitude element of A.

max_abs_A = 1;

max_abs_B is an upper bound on the maximum magnitude element of B.

max_abs_B = 1;

Thermal noise standard deviation is the square root of thermal noise power, which is a system parameter. A well-designed system has the quantization level lower than the thermal noise. Here, set thermalNoiseStandardDeviation to the equivalent of $- 50$ dB noise power.

thermalNoiseStandardDeviation = sqrt(10^(-50/10))

thermalNoiseStandardDeviation = 
0.0032

The standard deviation of the noise from quantizing the elements of a real signal is $2^{- precisionBits} / \sqrt{12}$ [5,6]. Use the fixed.realQuantizationNoiseStandardDeviation function to compute this. See that it is less than thermalNoiseStandardDeviation.

quantizationNoiseStandardDeviation = fixed.realQuantizationNoiseStandardDeviation(precisionBits)

quantizationNoiseStandardDeviation = 
1.7206e-08
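The same value follows from the formula $2^{- precisionBits} / \sqrt{12}$ using only base MATLAB arithmetic. In this sketch, the names quantizationNoiseStd and noiseRatio are introduced for illustration.

```matlab
precisionBits = 24;
thermalNoiseStandardDeviation = sqrt(10^(-50/10));

% Standard deviation of uniform quantization noise with step 2^(-precisionBits).
quantizationNoiseStd = 2^(-precisionBits)/sqrt(12)

% Ratio of thermal noise to quantization noise; greater than 1 means
% the quantization level is below the thermal noise.
noiseRatio = thermalNoiseStandardDeviation/quantizationNoiseStd
```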

Compute Fixed-Point Types

In this example, assume that the designed system matrix $A$ does not have full rank (there are fewer signals of interest than number of columns of matrix $A$ ), and the measured system matrix $A$ has additive thermal noise that is larger than the quantization noise. The additive noise makes the measured matrix $A$ have full rank.

Set $σ_{noise} = σ_{thermal noise}$ .

noiseStandardDeviation = thermalNoiseStandardDeviation;

Use fixed.realQRMatrixSolveFixedpointTypes to compute fixed-point types.

T = fixed.realQRMatrixSolveFixedpointTypes(m,n,max_abs_A,max_abs_B,...
    precisionBits,noiseStandardDeviation)

T = struct with fields:
    A: [0×0 embedded.fi]
    B: [0×0 embedded.fi]
    X: [0×0 embedded.fi]

T.A is the type computed for transforming $A$ to $R$ in-place so that it does not overflow.

T.A

ans = 

[]

          DataTypeMode: Fixed-point: binary point scaling
            Signedness: Signed
            WordLength: 31
        FractionLength: 24

T.B is the type computed for transforming $B$ to $Q^{'} B$ in-place so that it does not overflow.

T.B

ans = 

[]

          DataTypeMode: Fixed-point: binary point scaling
            Signedness: Signed
            WordLength: 31
        FractionLength: 24

T.X is the type computed for the solution $X = A \backslash B$ so that there is a low probability that it overflows.

T.X

ans = 

[]

          DataTypeMode: Fixed-point: binary point scaling
            Signedness: Signed
            WordLength: 36
        FractionLength: 24

Upper Bounds for R and C=Q'B

The upper bounds for $R$ and $C = Q^{'} B$ are computed using the following formulas, where $m$ is the number of rows of matrices $A$ and $B$ .

$\max (| R (:) |) \leq \sqrt{m} \max (| A (:) |)$

$\max (| C (:) |) \leq \sqrt{m} \max (| B (:) |)$

These upper bounds are used to select a fixed-point type with the required number of bits of precision to avoid overflows.

upperBoundR = sqrt(m)*max_abs_A

upperBoundR = 
17.3205

upperBoundQB = sqrt(m)*max_abs_B

upperBoundQB = 
17.3205

Lower Bound for min(svd(A)) for Real A

A lower bound for $\min (svd (A))$ is estimated by the fixed.realSingularValueLowerBound function using a probability that the estimate $s$ is not greater than the actual smallest singular value. The default probability is 5 standard deviations below the mean. You can change this probability by specifying it as the last input parameter to the fixed.realSingularValueLowerBound function.

estimatedSingularValueLowerBound = fixed.realSingularValueLowerBound(m,n,noiseStandardDeviation)

estimatedSingularValueLowerBound = 
0.0371

Simulate and Compare to the Computed Bounds

The bounds are within about an order of magnitude of the simulated results. This is sufficient because the number of bits translates to a logarithmic scale relative to the range of values. A factor of 10 corresponds to between 3 and 4 bits ( $\log_{2} 10 \approx 3.32$ ). This is a good starting point for specifying a fixed-point type. If you run the simulation for more samples, then it is more likely that the simulated results will be closer to the bound. This example uses a limited number of simulations so it doesn't take too long to run. For real-world system design, you should run additional simulations.

Define the number of samples, numSamples, over which to run the simulation.

numSamples = 1e4;

Run the simulation.

[actualMaxR,actualMaxQB,singularValues,X_values] = runSimulations(m,n,p,rankA,max_abs_A,max_abs_B,...
    numSamples,noiseStandardDeviation,T);

You can see that the upper bound on $R$ compared to the measured simulation results of the maximum value of $R$ over all runs is within an order of magnitude.

upperBoundR

upperBoundR = 
17.3205

max(actualMaxR)

ans = 
8.3029

You can see that the upper bound on $C = Q^{'} B$ compared to the measured simulation results of the maximum value of $C = Q^{'} B$ over all runs is also within an order of magnitude.

upperBoundQB

upperBoundQB = 
17.3205

max(actualMaxQB)

ans = 
2.5707

Finally, see that the estimated lower bound of $\min (svd (A))$ compared to the measured simulation results of $\min (svd (A))$ over all runs is also within an order of magnitude.

estimatedSingularValueLowerBound

estimatedSingularValueLowerBound = 
0.0371

actualSmallestSingularValue = min(singularValues,[],'all')

actualSmallestSingularValue = 
0.0420

Plot the distribution of the singular values over all simulation runs. The distributions of the largest singular values correspond to the signals that determine the rank of the matrix. The distributions of the smallest singular values correspond to the noise. The derivation of the estimated bound of the smallest singular value makes use of the random nature of the noise.

clf
fixed.example.plot.singularValueDistribution(m,n,rankA,noiseStandardDeviation,...
    singularValues,estimatedSingularValueLowerBound,"real");

[Figure: Singular value distributions for 300-by-10 real matrices of rank 3 with $σ_{noise} = 0.00316$ . x-axis: Singular value magnitude. y-axis: Probability.]

Zoom in on the smallest singular value to see that the estimated bound is close to it.

xlim([estimatedSingularValueLowerBound*0.9, max(singularValues(n,:))]);

Estimate the largest value of the solution, X, and compare it to the largest value of X found during the simulation runs. The estimate is about an order of magnitude larger than the actual value (a factor of roughly 10.4), which is sufficient for estimating a fixed-point data type because that factor corresponds to between 3 and 4 bits.

This example uses a limited number of simulation runs. With additional simulation runs, the actual largest value of X will approach the estimated largest value of X.

estimated_largest_X = fixed.realMatrixSolveUpperBoundX(m,n,max_abs_B,noiseStandardDeviation)

estimated_largest_X = 
466.5772

actual_largest_X = max(abs(X_values),[],'all')

actual_largest_X = 
44.8056
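To express the gap between the estimate and the simulated maximum in bits, take the base-2 logarithm of their ratio. This sketch hard-codes the two values reported above so that it runs on its own; the names ratioX and extraBits are introduced for illustration.

```matlab
estimated_largest_X = 466.5772;   % value reported above
actual_largest_X    = 44.8056;    % value reported above

ratioX    = estimated_largest_X/actual_largest_X   % factor between bound and simulation
extraBits = log2(ratioX)                           % same gap expressed in bits
```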

Plot the distribution of X values and compare it to the estimated upper bound for X.

clf
fixed.example.plot.xValueDistribution(m,n,rankA,noiseStandardDeviation,...
    X_values,estimated_largest_X,"real normally distributed random");

[Figure: X distributions for 300-by-10 matrices of rank 3 with $σ_{noise} = 0.00316$ . x-axis: X value magnitude. y-axis: Probability.]

Supporting Functions

The runSimulations function creates a series of random matrices $A$ and $B$ of a given size and rank, quantizes them according to the computed types, computes the QR decomposition of $A$ , and solves the equation $A X = B$ . It returns the maximum values of $R = Q^{'} A$ and $C = Q^{'} B$ , the singular values of $A$ , and the values of $X$ so their distributions can be plotted and compared to the bounds.

function [actualMaxR,actualMaxQB,singularValues,X_values] = runSimulations(m,n,p,rankA,max_abs_A,max_abs_B,...
        numSamples,noiseStandardDeviation,T)
    precisionBits = T.A.FractionLength;
    A_WordLength = T.A.WordLength;
    B_WordLength = T.B.WordLength;
    actualMaxR = zeros(1,numSamples);
    actualMaxQB = zeros(1,numSamples);
    singularValues = zeros(n,numSamples);
    X_values = zeros(n,numSamples);
    for j = 1:numSamples
        A = max_abs_A*fixed.example.realRandomLowRankMatrix(m,n,rankA);
        % Adding normally distributed random noise makes A non-singular.
        A = A + fixed.example.realNormalRandomArray(0,noiseStandardDeviation,m,n);
        A = quantizenumeric(A,1,A_WordLength,precisionBits);
        B = fixed.example.realUniformRandomArray(-max_abs_B,max_abs_B,m,p);
        B = quantizenumeric(B,1,B_WordLength,precisionBits);
        [Q,R] = qr(A,0);
        C = Q'*B;
        X = R\C;
        actualMaxR(j) = max(abs(R(:)));
        actualMaxQB(j) = max(abs(C(:)));
        singularValues(:,j) = svd(A);
        X_values(:,j) = X; % assumes p = 1, so that X is n-by-1
    end
end

Tips

These algorithms perform well when the probability of overflow is small. In this case, the estimated upper bound of overflow probability is close to the true overflow probability. When the probability of overflow is large, the estimated upper bound of overflow probability diverges and these algorithms may produce less accurate results.
Use the Data Type Agent tool to easily determine fixed-point types from within Simulink®.

References

[1] Bryan, Thomas A., and Jenna L. Warren. "Systems and Methods for Design Parameter Selection." The MathWorks. US Patent 12,045,737 B2, issued July 23, 2024. European EP 3,944,105 A1. https://patents.google.com/patent/US12045737B2/en?oq=US+12%2c045%2c737+B2
[2] Bryan, Thomas A., Jenna L. Warren, Shixin Zhuang, and Jessica Clayton. "Systems and Methods for Design Parameter Selection." The MathWorks. US Patent 12,008,344 B2, issued June 11, 2024. https://patents.google.com/patent/US12008344B2/en?oq=US+12%2c008%2c344+B2
[3] Perform QR Factorization Using CORDIC. Derivation of the bound on growth when computing QR. MathWorks. 2010.
[4] Zizhong Chen and Jack J. Dongarra. "Condition Numbers of Gaussian Random Matrices". In: SIAM J. Matrix Anal. Appl. 27.3 (July 2005), pp. 603–620. issn: 0895-4798. doi: 10.1137/040616413. url: https://dx.doi.org/10.1137/040616413.
[5] Bernard Widrow. "A Study of Rough Amplitude Quantization by Means of Nyquist Sampling Theory". In: IRE Transactions on Circuit Theory 3.4 (Dec. 1956), pp. 266–276.
[6] Bernard Widrow and István Kollár. Quantization Noise – Roundoff Error in Digital Computation, Signal Processing, Control, and Communications. Cambridge, UK: Cambridge University Press, 2008.
[7] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Second edition. Baltimore: Johns Hopkins University Press, 1989.

Suppress mlint warnings in this file.

%#ok<*NASGU>
%#ok<*ASGLU>
