An Explicit Estimated Baseline Model for Robust Estimation of Fluorophores Using Multiple-Wavelength Excitation Fluorescence Spectroscopy

Spectroscopy is a popular technique for identifying and quantifying fluorophores in fluorescent materials. However, quantifying the fluorophore of interest can be challenging when the material also contains other fluorophores (baseline), particularly if the emission spectrum of the baseline is not well-defined and overlaps with that of the fluorophore of interest. In this work, we propose a method that is free from any prior assumptions about the baseline by utilizing fluorescence signals at multiple excitation wavelengths. Despite the nonlinearity of the model, a closed-form expression of the least squares estimator is also derived. To evaluate our method, we consider the practical case of estimating the contributions of two forms of protoporphyrin IX (PpIX) in a fluorescence signal. This fluorophore of interest is commonly utilized in neuro-oncology operating rooms to distinguish the boundary between healthy and tumor tissue in a type of brain tumor known as glioma. Using a digital phantom calibrated with clinical and experimental data, we demonstrate that our method is more robust than current state-of-the-art methods for classifying pathological status, particularly when applied to images of simulated clinical gliomas. To account for the high variability in the baseline, we are examining various scenarios and their corresponding outcomes. In particular, it maintains the ability to distinguish between healthy and tumor tissue with an accuracy of up to 87%, while the ability of existing methods drops near 0%.


An Explicit Estimated Baseline Model for Robust Estimation of Fluorophores Using
Multiple-Wavelength Excitation Fluorescence Spectroscopy A. Gautheron , M. Sdika , M. Hébert , and B. Montcel Abstract-Spectroscopy is a popular technique for identifying and quantifying fluorophores in fluorescent materials.However, quantifying the fluorophore of interest can be challenging when the material also contains other fluorophores (baseline), particularly if the emission spectrum of the baseline is not well-defined and overlaps with that of the fluorophore of interest.In this work, we propose a method that is free from any prior assumptions about the baseline by utilizing fluorescence signals at multiple excitation wavelengths.Despite the nonlinearity of the model, a closed-form expression of the least squares estimator is also derived.To evaluate our method, we consider the practical case of estimating the contributions of two forms of protoporphyrin IX (PpIX) in a fluorescence signal.This fluorophore of interest is commonly utilized in neurooncology operating rooms to distinguish the boundary between healthy and tumor tissue in a type of brain tumor known as glioma.Using a digital phantom calibrated with clinical and experimental data, we demonstrate that our method is more robust than current state-of-the-art methods for classifying pathological status, particularly when applied to images of simulated clinical gliomas.To account for the high variability in the baseline, we are examining various scenarios and their corresponding outcomes.In particular, it maintains the ability to distinguish between healthy and tumor tissue with an accuracy of up to 87%, while the ability of existing methods drops near 0%.

I. INTRODUCTION
F LUORESCENCE is a non-linear optical process in which a chemical compound absorbs light at a specific wavelength and subsequently emits light at different wavelengths.In many materials including biological tissues, fluorescence involves a variety of fluorescent molecules.Some of these compounds may serve as can be relevant biomarkers in biomedical applications, by analyzing their time response after excitation or emission spectrum.Studying the temporal fluorescence response of biological tissues enables the estimation of fluorophore concentrations fitting decay models [1], or the estimation of neural activation spikes [2].Fluorescence spectroscopy analyzes the spectral signal emitted by fluorophores.This technique offers several advantages for characterizing molecular reactions or interactions.For example, the spectral fluorescence response of biological samples can enable the detection of to detect breast cancer [3] or residual tumors in the vicinity of the surgical site [4].However, as in [5] and [1], and in most practical cases, it is necessary to extract the temporal or spectral response of the fluorophores of interest from a global signal may include the fluorescence signal of other compounds.The base component signals can either be assumed to be known [6], assumed to have a specific signal shape (such as a Gaussian) [7], [8], or estimated through blind unmixing routines, such as the nonnegative matrix factorization algorithm [9], [10], [11], [12].In other words, a measured signal is of a linear combination of variable signals from fluorophores that are located near the measurement probe.We are interested in characterizing the specific contribution of a fluorophore of interest.Blind unmixing algorithms are generic and can be applied to various domains [1], [10], [13], [14] However, in our case, because they have limitations because they cannot accurately isolate contributions that are specific to a particular fluorophore of interest.
The sensitivity of fluorescence spectroscopy remains mainly limited by the presence of other fluorophores.The variability of the fluorescent response of these other fluorophores depends a lot on external factors [8], [15], [16], [17], [18], [19], [20], [21], [22], [23].The high variability of these fluorophores can lead to significant crosstalk with the fluorophore of interest.Omitting a fluorophore, poorly modeling it, or using an incorrect reference spectrum affects the detection of specific fluorescence signals from the fluorophores of interest.Crosstalk events cause an overestimation of the amplitude of the target fluorophores, leading to an overestimation of the contributions of the fluorophores of interest.This leads to a drop in specificity when interest biomarker contributions are utilized in a classification pipeline [19], [22].
To eliminate crosstalk, the general approach is to include a baseline signal in the model that encompasses all factors unrelated to the fluorophores of interest.Existing approaches are effective when the emission spectral band of the baseline is far from the one of the fluorophores of interest.In these cases, the analytical a priori on the baseline is an exponential shape function [15], [24].When the emission spectral band of the baseline is close to or within one of the fluorophores of interest, the existing methods remain more limited and very often expertdependent.In these cases, the baseline is modeled by a Gaussian function [8], or functions so-called expert, which contains a finite number of fluorophores of interest whose fluorescence signal shape is assumed known [22], [23], [25].All these models have strong limitations in the face of the variability of baseline fluorophores or when some baseline fluorophores emit too close to the wavelength range of the fluorophore of interest.Despite these limitations, fluorescence spectroscopy has shown its relevance for several applications in neuro-oncology [26], [27], [28], [29], [30], [31], food analysis [32] or biological studies at cell level [33].
Another approach suggested in [10] is to use the blind-source separation technique as a spectral unmixing routine.The main improvement of this technique is to estimate the baseline instead of modeling it.But this technique has existing limitations first as it requires an expert a priori to correctly tune hyperparameters like the number of sources.In addition, this technique is highly time-consuming as it relies on an iterative solver.
We propose in this study a novel approach to estimate the interest biomarkers' contributions.To relieve hyperparameters determination and the a priori on the spectral shape of the emission by the baseline, our method uses several excitation wavelengths, which is in contrast with the single excitation used in literature so far.This additional measurement allows us to estimate the baseline that must be fixed in a single measurement problem.However, considering multiple excitation wavelengths results in a non-linear estimation problem that in general does not have a closed-form solution.For the specific case of two distinct wavelengths, we were able to find the conditions for the problem to have a unique solution and to derive an analytic expression of the solution for all the variables.This makes our method computationally efficient, free from parameter tuning and cumbersome iterative estimation procedures.
To evaluate our method, we consider the practical case of the estimation of the contributions of two forms of Protoporphyrin IX (PpIX) in a fluorescence signal.This fluorophore of interest is widely used in the operating theatre in neuro-oncology to characterize the healthy/tumoral boundary of a brain tumor named glioma [34], [35], [36].Glioma is the most frequent and aggressive primary brain tumor whose main therapy is complete tumor removal surgery [37] followed by Stupp protocol [38].Thus, the neurosurgeon uses this information on PpIX contributions to remove or not the measured tissue.
Various 5-aminolevulinic acid (5-ALA) induced PpIX fluorescence spectroscopy methods have been proposed to overcome sensitivity issues of fluorescence microscopy applied to lowdensity infiltrative parts of High-Grade Glioma (HGG) [25], [39] or to Low-Grade Glioma (LGG) [40].However, the high levels of crosstalk due to the presence of other non-5-ALA-induced fluorophores largely impair specificity, because healthy samples are classified as tumoral.In Section II, we first describe the new estimation method using multiple excitations and derive the closed-form solution of the associated estimation problem.In Section III, we describe the numerical phantom calibrated on real HGG and LGG data [8], [41].Then, numerical results show that our method improves the robustness of the estimation of PpIX contributions and thus the specificity of the classification.Especially in cases where existing methods fail, e.g. when other fluorophores emit in the same spectral domain as PpIX.Finally, we compare the performance of our method with other methods issued from the literature in terms of their ability to maintain significant specificity in detecting the healthy/margins boundary on synthetic images of LGG and HGG calibrated on a real experimental system [8].Our method keeps a specificity of 87% while other models' specificity drops to 0% at best.

A. Physical Model
The spectrally resolved signal of the fluorescence emission used in this study relies on multiple excitation wavelengths.The spectrally resolved signal of the fluorescence measured signal m can be written as: where λ e is the fluorescence excitation wavelength, S i (λ) ∈ R p is the normalized emission spectrum of a fluorophore of interest, η i (λ e ) ∈ R the quantum yield at excitation wavelength λ e , α i ∈ R the contribution of the i th fluorophore of interest.b(λ) ∈ R p , the baseline represents other endogenous fluorophore emission spectra whose amplitude concerning the excitation wavelength is modeled by γ(λ e ) ∈ R. Each emission spectrum S i (λ) has been normalized by its integral over the entire wavelength range.S i (λ) refers to the signal of the fluorophores of interest and η i corresponds to the amplitude variation due to the different excitation wavelengths.γ represents the baseline amplitude variation due to the different excitation wavelengths, and b(λ) is the baseline i.e. the residual signal of all but the fluorophores of interest.Fluorophore estimation is the problem of estimating α i .In this problem, S i (λ) and η i are supposed to be the known, α i , γ and b(λ) are to be estimated.
One emphasizes that the main assumption in (1) and hereinafter is that the fluorescence baseline keeps the same emission spect,ral shape whatever the excitation wavelength is.This assumption is based on the fact that the emission spectra are generally independent of the excitation wavelength [42].Please note that in the presence of k distinct excitation wavelengths, the condition n + k − 1 ≤ (k − 1)p is required for the problem to be either overdetermined or perfectly determined.We assume hereinafter that the previous condition is fulfilled.Moreover, it is assumed that p k, e.g.p = 819 and k = 2, as in Part III.For the sake of concreteness, we consider only the case of n = 2 fluorophores of interest and k = 2 excitations wavelengths: Defining contributions and the estimated measurements vector as α = (α 1 , α 2 ) T ∈ R 2 and m = (m T , m T ) T ∈ R 2p , (2) can be written in a matrix form as where

1) Estimated Baseline Model (EB):
Our estimator of the fluorophore contributions, called the Estimated Baseline model (EB), is the least square solution of the (3): where m = (m T , m T ) T ∈ R 2p×1 is the experimental measurements vector.In this optimization problem, α, b and γ are the variables to estimate and all other parameters are given.We stress that, as opposed to the state-of-the-art, the Estimated Baseline model is considered as a semi-blind method by assuming that the spectral shape of the baseline, which is estimated directly from the data, is the same at two excitation wavelengths.

2) Analytic Resolution of Estimated Baseline Model:
The interaction between γ and b make the Estimated Baseline Model non-linear.However, we will show in this section that it is still possible to find a closed-form expression for α.The problem is treated in two steps: the least-square solution (α γ , b γ ) for a given γ is first derived and then the optimal γ is found: For a fixed γ, matrix M γ is constant and the problem is linear.The optimal (α, b) is given by An expression for M + γ can be obtained using a formula for the inverse of block matrices, 2.3 of [43].M T γ M γ is given by: and if Defining and finally, the pseudo inverse M + is given by: where Note that this expression is valid only when S is full rank and E g is invertible, i.e, when γ / ∈ {ρ 1 , ρ 2 }.These two conditions are in agreement with the physics of the problem: S is full-ranked when the fluorophores of interest are not correlated, invertibility of E g means that the baseline and the fluorophores of interest are not correlated.The two excitation wavelengths should be chosen to ensure that γ at these wavelengths is different from the ratios of the fluorophores of interest.From a signal processing point of view, this condition creates specific hot points for γ where α cannot be estimated.Noises in the measurements m and m enlarges these hot points into instability regions.It is assumed hereinafter that these conditions are satisfied.
3) Analytical Estimation of the Parameters: Using expression (9) for M + γ , the least-square solution for α and b for a given γ are given by: Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

4) Residual Calculation and γ
Estimation: Equation ( 5) can now be written: where Using ( 10) and ( 11), one obtains: and from which the following expression of φ can be deduced: where q = (I − SS + )m and q = (I − SS + )m .As φ is a rational function with no real pole and goes to +∞ at ±∞, the optimal γ is a zero of: As the discriminant of the numerator is positive, there are two distinct roots: The second derivative of φ is positive for the optimal γ and: and consequently and we conclude that = 1 i.e. γ + is the optimal γ.

III. VALIDATION ON SIMULATIONS
In this section, we consider the practical case of the estimation of contributions of PpIX in a fluorescence signal.This fluorophore of interest is widely used in the operating theatre in neuro-oncology to characterize the healthy/tumoral boundary of a brain tumor named glioma [34], [35].Glioma is the most frequent and aggressive primary brain tumor whose main therapy is complete tumor removal surgery [37].Thus, the neurosurgeon uses this information on PpIX contributions to remove or not the measured tissue.Now, we compare the proposed method with state-of-the-art methods, including blind sources separation method when possible.We consider the two different forms of PpIX described in [15].Besides the wellknown PpIX fluorescence emission appearing with a maximum at 634 nm under excitation at 405 nm, another fluorescence band can be observed with a maximum at 620 nm under specific conditions [15].These two bands, hereinafter labeled PpIX620 and PpIX634, can be interpreted as bands corresponding to two different forms of PpIX [8], [15], [20], both ones having some interest.Parameters with subscript 1 denote quantities related to PpIX620 and parameters with subscript 2 denote quantities related to PpIX634.
We compare our method with multiple excitation models extrapolated from those of the literature.We define a digital phantom calibrated on real High-Grade Gliomas (HGG) and Low-Grade Gliomas (LGG) data to get a realistic simulation environment.
We now distinguish LGG with two pathological statuses, LGG and Healthy, from HGG with four pathological statuses: Solid Part, Dense Margins, Sparse Density Margins, and Healthy.
The baseline is defined as the non-5-ALA induced fluorescence, meaning the fluorescence that would be emitted in the same tissue without prior administration of 5-ALA to the patient.We assume that a pre-processing of the measurements has already taken place to subtract autofluorescence away from the spectral band of interest as proposed in the literature [15], [24].Thus, the resulting baseline considers only non-5-ALA-induced fluorophores whose emission is almost overlapping with the one of PpIX.The performance of the EB method is estimated for various conditions of baseline fluorophores whose emission spectrum overlaps the one of PpIX.Performance metrics that we use are the error of estimated PpIX contributions and the accuracy, sensitivity, and specificity of classification in the pathological status.Finally, the clinical diagnostic performance is assessed using glioma images synthesized by an experimentally calibrated digital phantom.

A. Models for Comparison Purpose
Using the same notations as in Section II-B, we now describe multiple excitation models extrapolated from those of the literature.

1) Baseline Free Model (BF):
In the Baseline Free model (BF), the baseline contribution is discarded.In the existing literature [15], [24] after having subtracted an exponential shape function, a least-square regression is used to estimate the fluorescence contributions of PpIX into the fluorescence signal.BF 1 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
refers to the Baseline Free Model with one excitation wavelength, which can be written in matrix form as follows: BF 2 refers to the Baseline Free Model with two excitation wavelengths.It can be written in matrix form as follows: In both cases, the parameters are estimated using a linear least square regression.

2) Gaussian Baseline Model (GB):
In the Gaussian Baseline model (GB) [8], the contribution of the baseline spectrum into the fluorescence signal is a Gaussian function and a leastsquare method is used to estimate PpIX contributions α 1 , α 2 .GB 1 refers to the Gaussian Baseline Model with one excitation wavelength, which is defined in (22).GB 2 refers to the Gaussian Baseline Model with two excitation wavelengths, which can be written as follows: where α 1 , α 2 , A, A , μ, σ are estimated by a bounded least squares curve fitting method with Bounds have been defined from [8].

3) Non-Negative Matrix Factorization (NMF):
We compared our results with a blind spectral unmixing method, known as non-negative matrix factorization (NMF) [44], [45], [46], using the nnmf internal routine of MATLAB 2022b.In the same trend as the 2 in [12], the 1-by-2p matrix mT of the measured signal is factored into two nonnegative matrix W (1-by-n ) and H (n -by-2p) such as mT ≈ W × H.For a small set of n nonnegative representative vectors concatenated in H, W represents their weights of mT .In the case of spectral unmixing, n can also be seen as the number of different sources sought.For the rest of this article, we will set n = n + 1 to 3: PpIX620, PpIX634, and the baseline.On a practical aspect, the stopping criteria and initial values of W and H were set to default values.The initial value of W was a randomly generated array of numbers between 0 and 1.The initial value of H was set to known reference spectra and a 1-dimensional randomly generated array of numbers between 0 and 1 for the baseline.Experiments have also been performed using randomly generated arrays of numbers between 0 and 1 as the initial values of H and W .They lead to slightly poorer classification results which do not change the conclusion.The stopping criteria were either a residual value lower than or equal to 1e -4 or a relative change in the elements of W and H lower than or equal to 1e -4 .

4) Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS):
Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) is a group of techniques for recovering pure response profiles [47] (spectra, pH profiles, time profiles,....) of chemical constituents or species from unresolved mixtures obtained in chemical processes when no prior information is available on the nature and composition of these mixtures.MCR-ALS methods have been extended to the analysis of many types of experimental data [48], including multi-channel data and non-evolutionary processes, such as abundance extraction for Multispectral Fluorescence Lifetime Imaging Microscopy [49].Practically, we used the toolbox described hereafter [50], using known reference spectra and a 1-dimensional randomly generated array of numbers between 0 and 1 for the baseline.The precision was set to 0.01% and no constraints were set on spectra and concentration, as for the EB method.The maximum iterations boundary was set to 100, which was considered as high enough because the iteration process stopped at iteration 22 caused by reaching the set precision.Experiments have also been performed using the pure method of extraction of initial spectra.In that case, the iteration process stopped at iteration 52 caused by reaching the set precision.They lead to slightly poorer classification results which do not change the conclusion.

B. Fluorescence Digital Phantom
To make a fair comparison of the new EB method with existing ones, we use a fluorescence digital phantom.This phantom includes PpIX and also additive fluorophores which compose the baseline.The digital phantom is parameterized from in vivo data [8] and experimental data obtained on liquid PpIX phantom made as in [20], [51].
We consider two excitation wavelengths λ e = 405 nm and λ e = 385 nm which are the respective maxima of the absorption spectrum of the two forms of PpIX considered here; the quantum yield ratio for PpIX620 is η 1 /η 1 = 1.62; for PpIX634, it is η 2 /η 2 = 0.76, referring to [41].The measurements and reference spectra used in the generation and the processing related to this part have a dimension p = 819.To make the baseline fluorescence spectrum neutral towards the studied models, the latter is modeled as a Lorentzian function with various parameters: the quantum yield γ(λ e ), central emission wavelength λ para , standard deviation σ para , and amplitude A.
A two-part noise calibrated on experimental data is included in the fluorescence simulation phantom: a Poisson noise P mimics photon noise and an additive Gaussian white noise N mimics electronic noise.This leads to the following function representing the generated spectrum: where All results given in this work have been computed over 10 5 α random draws equally split into 100 × 100 bins to map the region α ∈ [0, 2] × [0, 2].Each region of the α plane is associated with a pathological status (cf.Fig. 1).We focus on the boundary between healthy and tumoral tissues, which corresponds to an area of this plane.All random draws were performed with Gaussian distributions linked with pathological status, whose parameters are extracted from previous in vivo works [8].

C. Sets of Baseline Parameters
Seven baseline datasets have been simulated.These datasets correspond to different scenarios for the application of our method in clinical practice.The simulation parameters are given in Table I where U (a, b) is a uniform distributionbetween a and b, and Γ(k, β) is a gamma distribution with shape factor k and scale factor β.
The first three datasets of parameters are defined to study the robustness regarding the amplitude or width parameters of the fluorescence spectrum attached to the baseline.The Outside and Inside datasets study the influence of the position of the baseline central wavelength regarding the spectral emission band of PpIX.The Inner and Across datasets are defined to study the impact of the baseline fluorescence quantum yield.This parameter is particularly critical regarding the instability areas of EB (see Section II-B2).The Inner dataset models baseline quantum yields ranging out of the PpIX instability spots, whereas the Across set models the intersection of the baseline and PpIX quantum yields ranges.An example of fluorescence spectrum for some of the baseline sets is plotted in Fig. S1 of the Supplementary Material.

D. Classification Method
To reach the final aim of pathological status prediction, one evaluated the ability to classify tissues from the estimated contributions of PpIX inside the fluorescence spectrum.The same simulated measurements m have been independently processed by each estimation model, and estimated PpIX contributions α * have been classified with a naive Bayesian classifier [56].Knowledge of classes given in [8] has been used to define a priori probability laws associated with each class.Ground truth PpIX contributions α used in the phantom have also been classified to serve as a reference for classification performance.For each bin, we compute the accuracy acc by using the formula below where c is a class, c * an estimated class, C the set of classes, s a sample, S c the set of samples in class c, c * s the estimated class for the sample s, N the total number of samples in a bin, and N c the number of samples of the bin that belong to the class c.For each bin of any map, the color is related to the accuracy value, the greener the higher accuracy.
p(c) follows a gaussian probability law whose parameters are defined in [8].The accuracy is computed in LGG or HGG cases.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.For an HGG, the estimated parameter pair of each measurement may be associated with a most likely class between Core, Dense Margins, Sparse Margins, or Healthy.For an LGG, the same classification may occur for the two following classes: Low Grade, i.e. tumoral, or Healthy.In Fig. 1, we computed the probability map of classes of the HGG case, resp.LGG, to help interpret the maps.This allows us to associate a most likely class to each region of the (α 1 , α 2 ) plane.The region in the bottom left corner of each map corresponds to the healthy tissue region.One notices that the probability 0.5 represents the border between two different classes of the α plane region.We also compare our results with blind spectral unmixing methods such as non-negative matrix factorization (NMF) and the Multivariate Curve Resolution-Alternating Least Squares(MCR-ALS).Since the latter do not necessarily provide unmixing results in (α 1 , α 2 ) space, the comparison was performed after the final output of the spectral unmixing followed by classification process, in Sections III-E2 and III-E3.We chose not to proceed with output space adaptation of the spectral unmixing method to keep a fair comparison between all methods.To perform the classification, it was necessary to train separate Bayesian classifiers for HGG and LGG on the general dataset for the NMF, resp.MCR-ALS results.The training of the four separate Bayesian classifiers (two for NMF and two for MCR-ALS) is conducted using 45 items for each class.To compare the classification results of each state-of-the-art method, we opted to generate receiver operating characteristic (ROC) curves and calculate the area under the curve (AUC).The higher the AUC, the better the classifier.The AUC was computed after using NMF, MCR-ALS, or estimation models.

E. Results and Discussion
Fluorescence signal by non-5-ALA-induced fluorophores varies a lot, which leads to important crosstalk with PpIX.Each baseline set of parameters represents one example among all possible spectra.we now look at the estimation error on the parameters of interest with every model and for all these sets.

1) Estimation Error on α:
In Fig. 2 are presented the mean value and the standard error of the absolute estimation error between the ground truth α used to generate spectra with the numerical phantom and α * estimated using a given method on a specific baseline set of parameters (see Table I).One sees that with every model, the error increases a lot except for the EB for which the error remains rather constant.Concerning General, Amplitude, and Witdh datasets, we can notice that existing models are all quite insensitive to Width but very sensitive to Amplitude and that EB 2 remains insensitive to Width while solving the Amplitude issue.Considering the Outside dataset, existing models work well because the emission band of the baseline is far from the PpIX emission spectrum: crosstalk is low, and baseline and PpIX are decorrelated.Comparing the Inside dataset with the Outside one, we notice that the error made by state-of-the-art models is increased on average by at least a factor of 5.At the same time, the EB model varies very little: the error is on average increased by a factor of 1.24.EB is not very sensitive to Inside/Outside datasets since it has no a priori on the spectrum of the baseline while existing methods are very sensitive to the baseline emission band.
For Inner Dataset, EB has the lowest error and, at the same time, other models are not better than the General case.The Inner Dataset is chosen to be favorable for EB since the baseline quantum yield γ is decorrelated from those of PpIX.Because existing models do not consider it, they are insensitive to changes of γ.On the contrary, Across Dataset is chosen to be unfavorable for EB.One notices the significant increase in EB's error when the quantum yield of the baseline approaches those of PpIX.This is a direct consequence of specific hot areas explained in the previous Section II-B2.It seems to point out that the case modeled by the Across dataset is not very plausible physically.because it considers a fluorophore with a quantum yield γ close to PpIX without the spectral shape corresponding to the one of PpIX.Moreover, if this case happens, one solution to tackle this issue can be to change λ e , so the hot points will be outside the area of interest.
EB is not always the model that has the lowest error, especially when the a priori are in favor of existing models (see for example the Outside dataset).In any case, it maintains a low error even when the amplitude increases.The Across case is not very plausible physically.A more plausible physical case where fluorophores are very close to PpIX would be a merge of Across and Inside datasets.We assume that all the models would become bad because the case is very difficult for all models, even EB would not solve this case.
2) Impact of α * on Classification: In Fig. 3, one sees with both LGG and HGG that with the Amplitude dataset, EB is the only model able to correctly classify Healthy samples when the baseline amplitude increases.Comparing single-and twoexcitation wavelength models for the Amplitude dataset, one can notice that two excitation wavelength models (GB 2 , BF 2 ) improve a little the correct classification of Healthy class samples located in the center of the region.It is confirmed by an increase of BF 2 , resp.GB 2 , accuracy compared to BF 1 , resp.GB 1 in Fig. 4. Thus, the success of EB is more due to the estimated baseline than the multiple excitation wavelengths.In the case of HGG, the border between Low-Density Margins, High-Density Margins, and Core is thicker for GB 1 , GB 2 and BF 1 , BF 2 models than for EB which is a direct consequence of the estimation error by these models, refer to Fig. 2. In addition, the shading effect in Fig. 3 reveals a loss in classification precision for core and margins classes.In the case of LGG, the ability to classify is lost because there are only two classes.For the Width dataset, we notice that EB is the only model whose accuracy remains constant with both LGG and HGG.Thus, EB seems to be the most relevant model if the baseline amplitude or width has an important variability.
We now study the impact of the baseline central wavelength on accuracy (see Fig. 4).Considering the Outside dataset, all models have similar performance.One can notice for the Inside dataset that EB is the only model whose accuracy overpasses 75%.Despite the estimation error being higher in EB than in existing models, the classification performance is equivalent in the Outside dataset where there is no crosstalk between baseline and PpIX fluorescence spectra.We can thus consider using EB even when there is no crosstalk issue since it is as good as the state-of-the-art methods.This result is striking because EB was not particularly expected to perform in this case.This result reveals the strength of EB compared to other models: it is less sensitive to the emission band of the baseline.
Let us assess the impact of baseline fluorescence quantum yield on accuracy.For the Inner dataset, the accuracy is the same as in the General dataset for all models.Thus, the drop of error with EB, in this case, does not affect the classification.
Since the strength of the estimated baseline model is given by the estimated baseline, one now wants to compare it to blind spectral unmixing methods.We chose NMF as the comparison method.However, since the estimated output parameters of each model are not in the same space, we observe the receiver operating characteristic curve and the area under this curve for each method.Fig. 5(b), resp.Fig. 5(a), represents the area under the curve (AUC) for the different methods and datasets in the HGG case, resp.LGG case.
The General dataset is the classifier training dataset for NMF.One notices that in terms of AUC on this dataset the NMF method performs worse than our Estimated Baseline model for both the LGG and HGG cases.
As the baseline's amplitude increases in the dataset Amplitude, the AUC of the NMF method drops as much as that of the state-of-the-art models.When the baseline's width increases in the dataset Width, the AUC of the NMF method is reduced a little compared to the Estimated Baseline Model's one.
Considering the Outside dataset, all models have similar performance for LGG.For HGG, NMF performs somewhat worse than all other methods, including EB.For the Inside dataset, Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.the AUC of NMF and peak methods are lower than GT's AUC whereas the EB and GT's AUC are similar for both LGG and HGG.
One focuses on the effect of baseline quantum yield on the AUC.For the Inner dataset, the AUC is the same as for the General dataset for all models: the AUC of NMF is lower than the one of EB.For the Across dataset associated with HGG, we note that the AUC of NMF remains lower than the one of EB.For LGG, NMF has an AUC in the same range as the one of EB.
Indeed, EB has a better accuracy and a higher AUC than the existing models including NMF when the baseline amplitude increases, and it keeps an accuracy and AUC equivalent to those of existing models in the unfavorable case of Across dataset where the baseline quantum yield γ is close to those of the PpIX.One can imagine using EB all the time even if in cases modeled by the Across dataset since at worst it will be equivalent to state-of-the-art performances.

3) Clinical Practice Performance on Synthesized Glioma
Image: To measure the final impact of this study on clinical practice, we apply the "estimation + classification" pipeline to an image of glioma considering a real experimental system.Each glioma is either LGG (Fig. 6(a)) or HGG (Fig. 6(b)).Clinically, the average size of the glioma is 3 cm, we set its size to 75% of image size so the pixel size corresponds to 130 μm which corresponds to the typical scale at the tissue level.A single spectrum is generated in each pixel according to the pixel's ground truth class.Thus, the emission spectrum of the baseline and the contributions of the fluorophore of interest are different between the two neighboring pixels.The size of a measurement probe is 4 mm (30 pixels in the image) which corresponds to the typical size of a spectroscopic probe or hyperspectral imaging system in the literature [8].Images of Fig. 6 account for the spatial point spread function of the probe: the initial 300 × 300 × 2 × p hyperspectral images have been padded with replicates of the borders of the original content and then convolved with a Gaussian kernel of size 30 × 30 pixels.Finally, the resulting 300 × 300 × 2 × p hyperspectral images, called Clinical Synthetic Images (CSI), are clipped to the original size.For all images, the estimation is performed independently pixel by pixel.To numerically estimate the classification, we simplify the problem into two classes, healthy or tumor tissue.We thus calculate the sensitivity also called true positive rate (TPR) and specificity called true negative rate (TNR) of each classification for all datasets.These metrics are relevant because sensitivity refers to the model's ability to correctly detect tumor tissues, while specificity relates to the model's ability to correctly detect Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.Section III-E2 because the convolution effect of the probe is not considered.One notice that the classification task is an ill-posed problem because the sensitivity and specificity of ground truth are unable to reach 1.0.
We first focus on the Amplitude and General Datasets.Corresponding sensitivity (TPR) and specificity (TNR) values are given in Table IIa and IIb.We notice a strong impact of the measurement probe's convolution on the classification sensitivity (TPR) which goes from approximately 80% to 100%.In the Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
same way, the specificity (TNR) increases from 59% to 73% at best.This is relevant in terms of clinical practice since the neurosurgeon removes millimeter-sized pieces that therefore give very high sensitivity.Moreover, previous results are confirmed on simulated glioma: only EB can correctly distinguish healthy tissues from tumoral ones when the amplitude of the baseline increases.
One can see a loss of specificity when comparing the Inside with the Outside dataset for all models except EB.This goes along with previous results.In addition, the specificities of the Outside dataset are higher than in the General dataset.This is due to the difference in baseline positions between the two datasets.Moreover, in the HGG case, we notice that the specificity of GB 1 and GB 2 is better than the one of Ground Truth.This bias of the method has no physical sense and is because the Lorentzian is narrower than the Gaussian.
Thus, when the Lorentzian is positioned where the Gaussian of the GB models can be (i.e. the outside Dataset), then the Gaussian baseline of GB models has higher values than the Lorentzian at the level of the PpIX peaks.As a result, α is systematically underestimated and we have this effect of a fall in the number of true healthy identified as tumoral and therefore an increase in specificity.
Looking at Inner and Across datasets, we notice that the specificity of each model is not impacted by the change related to these datasets.EB has the best specificity in the unfavorable case of Across dataset where the baseline quantum yield γ is close to those of the PpIX.From a performance point of view, the EB method is 1.06 times faster than the blind methods which are finally not very accurate, and 1706 times faster than the accurate state-of-the-art methods (GB 2 ).From a clinical point of view, EB is the best of all models in all studied cases.

IV. CONCLUSION
In this work, we proposed a novel approach using several excitation wavelengths to estimate the PpIX fluorescence-related biomarkers contributions which make our method free from any a priori on the fluorescence emission spectrum of the baseline.
Furthermore, we derived analytical expressions for the least square estimator of the proposed model.This allows our method to be computationally efficient and free from the use of iterative non-linear estimation procedures.We have shown that this new model better estimates the pathological status of the measured tissue when baseline amplitude increases: it keeps a specificity almost equal to that of the ground truth while the one of existing models drops to 0. In addition, this new model is as accurate as existing models in unfavorable cases, e.g., when baseline quantum yield is near PpIX ones.This new model reveals the potential of multiple excitation wavelengths to increase the classification sensitivity of fluorescence measurements.
Future work may focus on hot spot management and extend this study to more than two excitation wavelengths.In addition, complementary aims will be explored regarding the application of the model to real clinical data and the correlation between the baseline and the fluorophores of interest.Furthermore, the characterization of the execution speed depending on the image size and the size p of the spectra could be explored in relation to clinical constraints.This work assumes the independence of the baseline spectral shape with the excitation wavelength, which is consistent with the physics of fluorescence.However, complex microenvironments could induce mechanisms that alter this assumption, and the EB model's robustness to such effect could be explored.As described in [36], 5-ALA-induced PpIX is the clinical gold standard for HGG despite the associated financial constraints.In neurosurgery, compared with other fluorescent agents, 5-ALA-induced PpIX is an agent that specifically labels tumor cells.For LGG, the use of 5-ALA as a surgical adjunct remains more limited mainly by limited PpIX visible fluorescence, which reduces the signal-to-noise ratio of the measurements, and the post-processing applied gives poorer results.The EB model presented and characterized in this article aims at better differentiating the PpIX 620 and 634 forms, which are present together in cases where the fluorescence signal is weak.Finally, the EB model could be applied to other fluorophores of clinical interest, particularly to try to determine the relative quantities of NADH protein-bound form and free one.

Fig. 1 .
Fig. 1.Display level set of probability map to be in the class of the given color.In (a), LGG class corresponds to the red color and the Healthy class to the green one.In (b), the Core class corresponds to the black color, Dense Margins to the red color, Sparse Margins to the blue color, and the Healthy class to the green one.In each figure, the horizontal axis corresponds to α 1 and the vertical axis to α 2 .(a) LGG Case.(b) HGG Case.

Fig. 2 .
Fig.2.Mean and standard error of absolute estimation error of α.Each color bar corresponds to an estimation method, and each group corresponds to a set of baseline parameters (refer TableI).BF 1 , resp.GB 1 , BF 2 , GB 2 refers to the Baseline Free Model with one excitation wavelength, resp.The Gaussian Baseline Model with one excitation wavelength, the Baseline Free Model with two excitation wavelengths, and the Gaussian Baseline Model with two excitation wavelengths.The Estimated Baseline Model (EB) is our proposed method.

Fig. 3 .
Fig. 3. Accuracy Map for each estimation method in column : GT, resp.BF 1 , GB 1 , BF 2 , GB 2 refers to Ground Truth, resp.The Baseline Free Model with one excitation wavelength, and the Gaussian Baseline Model with one excitation wavelength, the Baseline Free Model with two excitation wavelengths, and the Gaussian Baseline Model with two excitation wavelengths.The Estimated Baseline Model (EB) is our proposed method.We classify on one side Low Grade Glioma (a) and on the other side High Grade Glioma (b).Columns give the set name defined in Table I: the baseline's amplitude increased when going from General to Amplitude dataset.(a) LGG Case.(b) HGG Case.

Fig. 4 .
Fig. 4. Mean Accuracy of the region of interest (0 α 1 0.25, 0 α 2 0.75).Each color corresponds to the same model and each group to a specific set of baseline parameters.GT, resp.BF 1 , GB 1 , BF 2 , GB 2 refers to Ground Truth, resp.The Baseline Free Model with one excitation wavelength, and the Gaussian Baseline Model with one excitation wavelength, the Baseline Free Model with two excitation wavelengths, and the Gaussian Baseline Model with two excitation wavelengths.The Estimated Baseline Model (EB) is our proposed method.(a) LGG Case.(b) HGG Case.

Fig. 5 .
Fig. 5. Area under the receiver operating characteristic curve (AUC).Each color corresponds to the same model and each group to a specific set of baseline parameters.GT, resp.BF 1 , GB 1 , BF 2 , GB 2 refers to Ground Truth, resp. the Baseline Free Model with one excitation wavelength, the Gaussian Baseline Model with one excitation wavelength, the Baseline Free Model with two excitation wavelengths, and the Gaussian Baseline Model with two excitation wavelengths.The AUC of the Non-Negative Matrix Factorization (NMF) spectral unmixing method, resp.The Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) method, is represented in each group by the royal blue bar, resp.by the dark blue bar.The Estimated Baseline Model (EB) is our proposed method.(a) LGG Case.(b) HGG Case.

Fig. 6 .
Fig. 6. Results on a 300 × 300 clinical synthetic images of Low Grade Glioma (a) and High Grade Glioma (b).Row name corresponds to an estimation models : GT refers to Ground Truth, BF 1 , GB 1 , BF 2 , GB 1 refer to state of the art methods and EB 2 .Columns gives the set name defined in Table I: the baseline's amplitude increased when going from General to Amplitude dataset.Red color refers to tumor area while green color refers to healthy area.In HGG Case (b), orange color refers to sparse tumor margins.(a) LGG Case.(b) HGG Case.

TABLE I PARAMETERS
USED FOR THE SIMULATION OF THE SEVEN BASELINE DATASETS Table I).BF 1 , resp.GB 1 , BF 2 , GB 2 refers to the Baseline Free Model with one excitation wavelength, resp.The Gaussian Baseline Model with one excitation wavelength, the Baseline Free Model with two excitation wavelengths, and the Gaussian Baseline Model with two excitation wavelengths.The Estimated Baseline Model (EB) is our proposed method.

TABLE II SENSITIVITY
(TPR) AND SPECIFICITY (TNR) VALUES FOR DIFFERENT DATASETS INCLUDING CLINICAL SYNTHETIC IMAGES (CSI) ONES healthy tissues.Case General 1px of Table II is equivalent to