Linear spectral unmixing of Fluorescence spectra

Question

Thomas on 16 Dec 2014

https://nl.mathworks.com/matlabcentral/answers/166822-linear-spectral-unmixing-of-fluorescence-spectra

Commented: Avery on 21 Jul 2023

Hi, I am trying to write a program for linear spectral unmixing with known endmembers. I have a fluorescence spectrum and the subspectra of the endmembers (all measured separately). What I now want to do is numerically estimate the intensities of the subspectra, using a least-squares fit to a linear spectral unmixing model (S = A*S1 + B*S2 + C*S3 + ...). I have 9 endmembers in the spectrum. As I have never done any numerical analysis, could someone point me in the right direction to start with?

Many Thanks T.

2 Comments
Image Analyst on 16 Dec 2014

What is an endmember? Can you give a numerical example? The subspectra are all curves, right? Like plots of narrow spikes of emittance versus wavelength? And you have another broad spectrum curve for S and you want to find the weights (A, B, etc.) such that a weighted sum of the S1, S2, ... curves equals your S curve, correct?

Thomas on 16 Dec 2014

An endmember is a substance I know to be in the tissue. All 9 subspectra are broadband spectra: the peaks are approx. 120 nm wide, while the whole spectrum spans 800 nm. They - unfortunately - overlap heavily. I do have the normalized fluorescence emittance (amplitude) of all of the subspectra. Exactly, my curve is S and consists of 9 subspectra that are linearly superimposed. I want to find A, B, C, etc.

Answer 1

John D'Errico on 16 Dec 2014

Edited: John D'Errico on 16 Dec 2014

Let me wade in here. From your question, you have a measured aggregate spectrum and, separately, measured component spectra of which you assume the aggregate is composed. Since the components are measured, there seems no reason to approximate them with Gaussians, which are often only a poor approximation to the shape of such components anyway. (Gaussians are symmetric and they have a very specific shape.)

So, you have a "function", F, sampled at a discrete set of n wavelengths: that is your measured spectrum. At those same wavelengths, you have 9 separate components; I'll call them S_i. F is thus a discretely sampled function of wavelength, lambda, as are the components.

You now pose the mixture model for F,

F = a_1*S_1 + a_2*S_2 + a_3*S_3 + ... + a_9*S_9

Thus at any wavelength, the measured spectrum is presumed to be some (unknown) linear combination of the measured component sub-spectra. You wish to estimate the component fractions perhaps as a vector

A = [a_1; a_2; ... ; a_9]

Logically, the a_i will be constrained to be non-negative, an issue I'll discuss at some length below. I've defined A as a column vector because that is how most code would return it in MATLAB.

The simple approach to estimation of the mixture coefficients in A is to use a basic linear regression. Here we would minimize the sum of squares of the residuals for the mixture model. The simple solution to that is:

Assuming column vectors for F and the S_i that are all the same length, define the n by 9 matrix S whose columns are the 9 component subspectra.

S = [S_1,S_2,S_3,S_4,S_5,S_6,S_7,S_8,S_9];

Then if F is also a column vector of length n,

A = S\F;

This is a simple linear regression (not unlike that which regress would return), and it will work acceptably SOME of the time, but it will fail terribly on occasion, because it employs no non-negativity constraints on the coefficients in A.
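As a minimal sketch of the unconstrained fit described above, the following builds a synthetic test case and solves it with backslash. Everything about the data is an assumption made for illustration: the wavelength grid lambda, the peak positions in centers, the coefficients in A_true and the noise level are invented, and Gaussian-shaped columns are used only as stand-ins for measured sub-spectra. It also assumes R2016b or later for the implicit expansion in lambda - centers.

```matlab
% Synthetic illustration only: Gaussian-shaped columns stand in for the
% measured sub-spectra (real measured components need not be Gaussian).
rng(0);                                    % reproducible noise
lambda  = linspace(400, 1200, 401)';       % hypothetical wavelength grid, nm
centers = linspace(500, 1100, 9);          % hypothetical peak positions, nm
S = exp(-((lambda - centers)/50).^2);      % n-by-9 matrix, one component per column

A_true = [0.8; 0; 0.3; 0; 0; 1.2; 0; 0.05; 0.5];  % made-up mixture coefficients
F = S*A_true + 0.01*randn(size(lambda));   % noisy aggregate spectrum

A_ls = S\F;                                % unconstrained least-squares estimate
disp([A_true, A_ls])                       % true vs. estimated, side by side
```

With noise present, the estimates for the components whose true coefficient is zero can come out slightly negative, which is the issue discussed next.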

The point is, a negative component makes no physical sense. You cannot have a negative amount of a sub-spectrum in the mixture, yet the simple linear regression will probably yield exactly that. It will happen because you have some noise in the measurement process, because your measured spectra were not perfectly measured, or because you might have some contribution from something you have not actually measured (often described as lack-of-fit), or for a few other problems I'm forgetting to mention. The point is, it WILL happen.

A negative component here might indicate a serious problem in your data, or it might be just trash. So it is always a good thing to look at the coefficients you would generate, to look at the resulting fit. Plot the residuals. Is there significant lack of fit?
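One way to carry out that check is sketched below, again on invented data (the grid, the Gaussian-shaped stand-in components and the coefficients are assumptions, not anything from the original question). It lists the coefficients, reports the residual norm relative to the signal, and plots the residuals against wavelength so that any systematic structure is visible.

```matlab
% Synthetic data for illustration only (see assumptions in the text above).
rng(1);
lambda = linspace(400, 1200, 401)';                  % hypothetical grid, nm
S = exp(-((lambda - linspace(500, 1100, 9))/50).^2); % stand-in sub-spectra
F = S*[0.8; 0; 0.3; 0; 0; 1.2; 0; 0.05; 0.5] + 0.01*randn(size(lambda));

A   = S\F;                  % unconstrained estimate
res = F - S*A;              % residuals of the mixture model
relErr = norm(res)/norm(F); % relative size of the misfit

disp(A')                    % look for negative or implausible coefficients
fprintf('relative residual norm: %.4f\n', relErr);

% Residuals that look like structureless noise suggest an adequate model;
% smooth bumps or trends suggest a missing or mis-measured component.
plot(lambda, res, '.-');
xlabel('wavelength (nm)');
ylabel('residual');
```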

Anyway, a more logical and better solution is to use a non-negative least squares solution. MATLAB offers such a solver in the form of lsqnonneg.

A = lsqnonneg(S,F);

A will now be a vector with non-negative components, that yield the best possible solution, subject to non-negativity constraints. In fact, sometimes some of the components of A may have some TINY negative numbers in them, on the order of eps, so roughly -1e-16 or so. That is floating point trash, and nothing to worry about here, but if it bothers you, just use

A = max(0,lsqnonneg(S,F));

instead. (This is where knowing something about numerical analysis helps, in knowing when you can safely discard something as trash, and when it is potentially important.) A nice thing about lsqnonneg is you have it in basic MATLAB, with no toolboxes required.
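A small side-by-side comparison of the two solvers, as a sketch on invented data (grid, Gaussian-shaped stand-in components, coefficients and noise level are all assumptions for illustration):

```matlab
% Synthetic data for illustration only.
rng(2);
lambda = linspace(400, 1200, 401)';                  % hypothetical grid, nm
S = exp(-((lambda - linspace(500, 1100, 9))/50).^2); % stand-in sub-spectra
A_true = [0.8; 0; 0.3; 0; 0; 1.2; 0; 0.05; 0.5];     % made-up coefficients
F = S*A_true + 0.01*randn(size(lambda));             % noisy aggregate spectrum

A_ls = S\F;                       % unconstrained: may go slightly negative
A_nn = max(0, lsqnonneg(S, F));   % non-negative least squares, clipped at 0

% Columns: true, unconstrained, non-negative
disp([A_true, A_ls, A_nn])
```

Components that are truly absent tend to be pinned at exactly zero by lsqnonneg, whereas backslash scatters them around zero.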

There are other ways you can do the estimation. One approach would be to minimize the sum of absolute differences of the model residuals, instead of a sum of squares of residuals. This can be achieved using a linear programming tool (linprog for example) with some slack variables as I recall. (I know I have a solver written for that problem lying around somewhere.) The difference between the lsqnonneg solution and the linprog solution will probably not be that important here, so I would just recommend lsqnonneg.
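For reference, one standard way to pose the sum-of-absolute-residuals fit as a linear program is sketched here. This is not the solver mentioned above, just a textbook formulation: introduce one slack variable t_j per wavelength with |F_j - S(j,:)*a| <= t_j, and minimize sum(t) subject to a >= 0. It assumes the Optimization Toolbox is available and a release whose linprog accepts options as the eighth argument; the data are invented as in the other sketches.

```matlab
% Synthetic data for illustration only.
rng(3);
lambda = linspace(400, 1200, 401)';                  % hypothetical grid, nm
S = exp(-((lambda - linspace(500, 1100, 9))/50).^2); % stand-in sub-spectra
A_true = [0.8; 0; 0.3; 0; 0; 1.2; 0; 0.05; 0.5];     % made-up coefficients
F = S*A_true + 0.01*randn(size(lambda));

[n, k] = size(S);

% Unknowns x = [a; t]: k mixture coefficients, then n slack variables.
f   = [zeros(k,1); ones(n,1)];   % objective: sum of the slacks t
Ain = [ S, -eye(n);              %  S*a - t <=  F
       -S, -eye(n)];             % -S*a - t <= -F
bin = [F; -F];
lb  = zeros(k + n, 1);           % a >= 0 and t >= 0

opts = optimoptions('linprog', 'Display', 'none');
x = linprog(f, Ain, bin, [], [], lb, [], opts);

A_l1 = x(1:k);                   % non-negative least-absolute-deviation estimate
disp([A_true, A_l1])
```

The two inequality blocks together enforce -t <= S*a - F <= t, so at the optimum each t_j equals the absolute residual at wavelength j.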

Finally, there is the question of estimates of the standard deviation of the parameters. One nice thing about a tool like regress is it will offer estimated standard deviations for the estimated mixture coefficients. The problem with those estimated uncertainties is they are based on an approximation that fails when the problem is a bounded one. So if some of your coefficients are zero or near zero, or worse, negative, those standard deviations are no longer really meaningful. While there are statistical techniques in existence that will try to offer better estimates of the uncertainty in your coefficients, regress won't help you here.
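One such technique, named here only as an example and not as a recommendation from the answer above, is a residual bootstrap: refit with lsqnonneg many times on spectra rebuilt from the fitted curve plus resampled residuals, then read off the spread of the coefficients. The sketch below makes several assumptions: invented data as in the other examples, residuals that are independent and identically distributed across wavelengths (often doubtful for real spectra), and an arbitrary choice of 200 resamples. Intervals for coefficients sitting at the zero bound should still be read with caution.

```matlab
% Synthetic data for illustration only.
rng(4);
lambda = linspace(400, 1200, 401)';                  % hypothetical grid, nm
S = exp(-((lambda - linspace(500, 1100, 9))/50).^2); % stand-in sub-spectra
F = S*[0.8; 0; 0.3; 0; 0; 1.2; 0; 0.05; 0.5] + 0.01*randn(size(lambda));

A_hat  = lsqnonneg(S, F);        % point estimate
fitted = S*A_hat;
res    = F - fitted;
n      = numel(res);

nBoot  = 200;                    % number of bootstrap resamples (arbitrary)
A_boot = zeros(numel(A_hat), nBoot);
for b = 1:nBoot
    Fb = fitted + res(randi(n, n, 1));   % resample residuals with replacement
    A_boot(:, b) = lsqnonneg(S, Fb);     % refit under the same constraint
end

% Rough 95% percentile intervals, computed without toolbox functions.
sorted = sort(A_boot, 2);
lo = sorted(:, round(0.025*nBoot));
hi = sorted(:, round(0.975*nBoot));

% Columns: estimate, lower bound, upper bound
disp([A_hat, lo, hi])
```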