A Bayesian method for inferring quantitative information from FRET data

  • Catherine A Lichten1, 2 and

    Affiliated with

    • Peter S Swain2Email author

      Affiliated with

      BMC Biophysics20114:10

      DOI: 10.1186/2046-1682-4-10

      Received: 14 February 2011

      Accepted: 9 May 2011

      Published: 9 May 2011



      Understanding biological networks requires identifying their elementary protein interactions and establishing the timing and strength of those interactions. Fluorescence microscopy and Förster resonance energy transfer (FRET) have the potential to reveal such information because they allow molecular interactions to be monitored in living cells, but it is unclear how best to analyze FRET data. Existing techniques differ in assumptions, manipulations of data and the quantities they derive. To address this variation, we have developed a versatile Bayesian analysis based on clear assumptions and systematic statistics.


      Our algorithm infers values of the FRET efficiency and dissociation constant, K d , between a pair of fluorescently tagged proteins. It gives a posterior probability distribution for these parameters, conveying more extensive information than single-value estimates can. The width and shape of the distribution reflects the reliability of the estimate and we used simulated data to determine how measurement noise, data quantity and fluorophore concentrations affect the inference. We are able to show why varying concentrations of donors and acceptors is necessary for estimating K d . We further demonstrate that the inference improves if additional knowledge is available, for example of the FRET efficiency, which could be obtained from separate fluorescence lifetime measurements.


      We present a general, systematic approach for extracting quantitative information on molecular interactions from FRET data. Our method yields both an estimate of the dissociation constant and the uncertainty associated with that estimate. The information produced by our algorithm can help design optimal experiments and is fundamental for developing mathematical models of biochemical networks.


      Proteins work together continuously in the cells of all living things, generating cascades of reactions that are vital for life. To fully understand each individual protein's task requires discovering the timing, location, and strength of its interactions. To acquire this detailed information, fluorescence microscopy methods are ideal because they can provide dynamic, single-cell data at high spatial resolution [1, 2]. One fluorescence tool that enables researchers to observe protein interactions in living cells is Förster resonance energy transfer (FRET). FRET data has the potential to yield biochemical constants, which are critical for modeling biological systems, but measuring protein interactions from FRET data requires careful quantitative analysis.

      FRET is a physical process where a molecule in an excited energetic state (the donor) transfers energy to a nearby ground-state molecule (the acceptor). The chance that an excited donor will transfer its energy to an acceptor (known as the FRET efficiency, E fr ) depends on the distance between the donor and acceptor (which must be between 1 and 10 nm for FRET to occur), their relative orientation, and the extent to which the donor's fluorescence emission peak overlaps the excitation peak of the acceptor [3]. Optimal conditions for energy transfer occur when the distance between donor and acceptor is minimal, the molecules' electric dipoles are aligned, and the spectral overlap is significant.

      When FRET is used to study protein interactions in living cells, the proteins under investigation are fused to fluorescent tags (often variants of the green fluorescent protein) that act as the donors and acceptors. When the proteins interact, they bring the fluorescent tags together so that FRET may occur. FRET increases the number of photons emitted by acceptors and reduces both the number of photons emitted by donors and the donor's fluorescence lifetime. To observe these effects, the most common techniques for collecting FRET data include fluorescence lifetime imaging (FLIM) and using a fluorescence microscope or spectrofluorometer to record fluorescence intensity after exposing samples to light that mainly excites either donors or acceptors [46].

      For FRET data to reveal information about the underlying protein interactions, complicating factors must be dealt with. One confounding issue is spectral contamination, which arises from the requirement that the donor and acceptor must have overlapping spectra for FRET to occur. Due to the overlap, excitation light intended to excite one fluorophore may also excite the other (spectral crosstalk), and, conversely, one fluorophore may emit photons in the emission range of the other (spectral bleed-through). Several FRET analysis methods address this issue by calculating a FRET index, which is the FRET signal corrected for contamination from spectral overlap and normalised by donor or acceptor concentrations [712]. Although they are straightforward to calculate, FRET indices often bear an indirect and nonlinear relationship to the underlying concentrations and strength of protein interactions [13, 14]. Alternatively, the ratio of donors to acceptors and the apparent FRET efficiency, which is the product of the intrinsic FRET efficiency and the fraction of donors (or acceptors) in complex, can be calculated [1519, 14]. Such quantities can be measured, provided calibrations are carried out using constructs consisting of a donor linked to an acceptor [17, 19], but again do not relate directly to the dissocation constant (K d ) of an interaction.

      Another significant challenge is that both the K d and the FRET efficiency, E fr , affect the observed signals, and so neither can be determined independently of the other based on data from a single sample. E fr has been estimated from separate acceptor photobleaching [14] and FLIM experiments [15], but these approaches have drawbacks. Acceptor photo-bleaching is slow, irreversible, often fails to bleach all acceptors, and yields the intrinsic FRET efficiency only when all donors are in complex with acceptors [46]. FLIM requires specialized equipment, is often slower than standard fluorescence imaging, and analysis of FLIM data is complicated by the multiexponential fluorescence decays of fluorescent proteins [20]. There is however an alternative to directly measuring E fr : Many values of K d and E fr will be consistent with data from a single cell or sample, but, by taking a set of data from samples that contain varying concentrations of donors and acceptors and analyzing it altogether, it is possible to find the unique values of K d and E fr that are consistent with the ensemble [16, 21].

      Given these different approaches, it is not always obvious which one should be applied in different situations and there is no consensus on the statistical analysis, with each method processing the data differently and most giving no procedure to test the reliability of any estimates. A general method for inferring the K d along with the uncertainty of that inference is necessary for the standardization of quantitative FRET measurements, the design of informative experiments, and for providing in vivo parameters for developing models of protein networks [2224].

      Here we propose a systematic analysis method that explicitly includes models of the photophysics and underlying chemical interactions and of measurement noise. Building on a spectral model for FRET [25], we develop a Bayesian algorithm to infer both the interaction strength (in terms of the in vivo K d ) and the FRET efficiency. Applying our algorithm to simulated data, we gain insight into how both experimental design factors such as measurement noise, number of measurements, fluorophore concentrations and ratios, and prior information impact our estimate and its uncertainty.

      Results and Discussion


      Overview of the mathematical model

      FRET enables the study of molecular interactions in diverse settings. To design a widely applicable analysis technique, we consider a general system containing proteins (or other molecules) that form complexes and are labelled with fluorophores that act as FRET donors and acceptors (Figure 1A). We assume that all the molecules of interest are fluorescently tagged, so instead of referring to 'donor-tagged proteins' or 'acceptor-tagged proteins', we simply refer to 'donors' and 'acceptors'.
      Figure 1

      Overview of interaction and photophysical model. (A) In the underlying interaction, a donor-tagged protein binds an acceptor-tagged protein with dissociation constant K d . (B) During a FRET experiment, photophysical events occur such as FRET, which occurs between donors and acceptors in complex with efficiency E fr , and the excitation of and emissions from donors and acceptors, where the factor relating fluorescence observed in channel i to the concentration of species S is http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq1_HTML.gif . We include all possible spectral overlap effects, allowing the donor and acceptor to be excited and emit photons in all spectral channels.

      Our model relies on a few other assumptions. First, we assume that donors and acceptors may be free or form bimolecular, donor-acceptor complexes (i.e. [D] + [A] ⇌ [DA]) with dissociation constant http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq2_HTML.gif . Whether free or in complex, the fluorophores undergo fluorescence excitation and emission (Figure 1B). In complexes, the donor and acceptor are together and FRET may also occur. We assume that FRET due to random collisions between donors and acceptors occurs seldom enough to be neglected. Finally, we assume that donor-acceptor complexes all have the same FRET efficiency, which is denoted E fr and refers to the fraction of instances where exciting a donor in complex leads to excitation of the acceptor.

      For our general mathematical description of the FRET system, we use a previously described spectral mixing model [25] with slightly modified matrices. This model relates concentrations of fluorescent molecules to the fluorescence observed at various excitation wavelengths and ranges of emission wavelengths (spectral channels). It can be written as a matrix equation in the following form [6]:

      where C is a vector of the concentrations of fluorophores (C = ([D], [A], [DA]) T ), I obs is a vector of the fluorescence intensities observed for each spectral channel, and M is a matrix of the photophysical parameters that relate the concentrations of fluorophore to the observed intensities (Figure 1B).

      The observed fluorescence intensities could represent data obtained from fluorescence microscopy, spectrophotometry or a flow cytometer. Microscopy images would require standard image processing steps for quantification, such as subtracting background fluorescence and defining regions of interest such as areas within the cytosol or nucleus for localized fluorescence [26].

      Photophysics in the mathematical model

      The matrix M contains all the relevant information about the photophysical processes we model. In addition to the expected direct excitation and emission, we include resonance energy transfer, cross-talk (unintended excitation) and bleed-through (acceptors emitting in the donors' typical emission range).

      The spectral mixing framework can represent any number of spectral channels by increasing the dimensions of the matrix, but we focus on the particular case of three, which describes the most common 'three-cube' FRET experiments [12, 79, 14, 18, 17, 21, 16]. The first spectral channel, known as the donor channel, consists of an excitation wavelength that primarily excites donors and an emission filter that collects photons from the donor's emission range. The second channel, the acceptor channel, consists of an excitation wavelength that primarily excites acceptors and an emission filter that collects photons from the acceptor's emission range. The third channel, the FRET channel, combines the donor channel's excitation wavelength with the acceptor channel's emission filter, so that donors are excited preferentially and the emissions are filtered to primarily collect photons from acceptors. For three-cube FRET, the detailed form of Eq (1) is
      where http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq3_HTML.gif . The parameters http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif each represent a combination of excitation and emission information [25]:

      Here the subscript i refers to the spectral channel (for the three-cube case, it is 1 for the donor channel, 2 for the acceptor channel, and 3 for the FRET channel) and the superscript (S) refers to the fluorescent species (D, the donor; A, the acceptor; and DA, the complex). The variable I i is the illumination intensity for the excitation wavelength of channel i; http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq5_HTML.gif is the molar extinction coefficient of species (S) with excitation wavelength from channel i; Q (S) is the quantum yield of species (S); and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq6_HTML.gif is the product of the emission of species (S) at the wavelengths of channel i and the sensitivity of the detector to the emitted photons. We describe a calibration procedure for obtaining these values in the Methods.

      To illustrate Eq (2), consider the expression it yields for http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq7_HTML.gif , the intensity in the donor channel. The intensity will be the sum of contributions from free donors, free acceptors, and donor- acceptor complexes. The contribution from free donors is http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq8_HTML.gif and the contribution from free acceptors is http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq9_HTML.gif , which results from cross-talk and bleed-through. The complex, DA, can potentially produce contributions from donors not undergoing FRET http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq10_HTML.gif , from acceptors http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq11_HTML.gif and from FRET http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq12_HTML.gif . Adding these contributions together, we obtain the following expression for the predicted intensity in the donor channel, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq7_HTML.gif ,

      Continuing the matrix multiplication yields analogous expressions for http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq13_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq14_HTML.gif .

      Molecular interactions in the mathematical model

      To include K d in our model, we model the chemical equilibrium, [D] + [A] ⇌ [DA]. Defining [D 0 ] = [D] + [DA] and [A 0] = [A] + [DA], and given that http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq15_HTML.gif , we can compute the values of [DA], [D], and [A] in terms of [D 0], [A 0], and K d .

      with [D] = [D 0] - [DA] and [A] = [A 0] - [DA].

      Bayesian inference

      Our aim is to infer the values of K d and E fr from the data. To this end, Bayesian inference is an ideal tool. Given a model, it quantifies the knowledge gained from a set of experimental observations about the parameters of interest, including the uncertainty of that knowledge [27]. That is, it allows us to update the prior probability distribution, P (K d , E fr ), which represents what is known about the parameters before data has been acquired, to the posterior probability distribution, P (K d , E fr |data), which reflects both the prior information and what is learned from the data. According to Bayes' theorem,
      The likelihood function, P (data|K d , E fr ), can be explicitly expressed through our photophysical model and a model of measurement noise. It quantifies the likelihood of different values for K d and E fr given the current data. We assume that the data follows a Gaussian distribution centred around the predicted intensity given by Eq (2). For three measurements, one from each of the three spectral channels, the likelihood is:

      The observed intensities for each channel are I D , I A , and I F . http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq7_HTML.gif , http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq13_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq14_HTML.gif denote the predicted intensities and σ D , σ A and σ F measure the magnitude of the measurement noise in each channel and are approximated by the variances of the data in that channel (Methods). For fitting, we use a Gaussian approximation of Eq (7) based around the values of D 0 and A 0 that maximize the likelihood (see section on marginalization of D 0 and A 0 in the Methods). Combining the likelihood and prior probability distribution allows us to calculate the posterior probability distribution and determine the most probable values for our parameters of interest.

      Illustration of method

      To illustrate our method, we simulated data from three cells (Figure 2A, left), although in practice, data could come instead from cellular compartments or other sub-cellular regions of interest. For simplicity, each cell contains equal concentrations of donor ([D 0]) and acceptor ([A 0]). In the first cell, [D 0] = [A 0] = 0.2 μM; in the second, [D 0] = [A 0] = 1 μM; and in the third, [D 0] = [A 0] = 5 μM. Other parameter values are given in the Methods. We then simulated three-cube FRET measurements, generating data for the donor, acceptor, and FRET channels (Figure 2A, right). We made a series of ten measurements for each channel and each cell, to which we added Gaussian noise so that the standard deviation of the measurements was around 5% of the mean signal strength. We define r as the ratio of the standard deviation of the measurements to the mean signal strength; in this case, r ≈ 0.05.
      Figure 2

      Analysis of typical FRET data recovers true values of K d and E fr . A typical three-cube FRET experiment is simulated from three virtual cells, each containing the indicated concentrations of donor- and acceptor- tagged proteins (A, left). The data is summarized in bar graphs with mean ± SD (A, right). Ten measurements/channel are made for each cell with 5% added measurement noise. For other parameters, see Methods. The data was analyzed in two ways to find the values of K d and E fr consistent with the data. We first calculated the energy for an array of parameter values; as the energy contour plot shows (B), the energy was minimal near the true parameter values. We also used a Monte Carlo algorithm to explore K d -E fr space, running 3 biased random walks starting from different initial positions (white *) and running for 12,000 steps. The paths of the walks (for clarity only the first 2,000 steps are shown) are superimposed on the contour plot (B) and all three converged to a region around the true value which coincides with the energy minimum. Histograms of the locations visited by all three walks, including only post-convergence steps (11,000 steps are included from each walk) act as approximate posterior distributions for K d and E fr (C). The dashed red lines indicate the true values of K d and E fr used to generate the data.

      To obtain the posterior probability distribution that corresponds to this data for two parameters of interest, K d and E fr , we used two methods, the results of which should agree. First, because we are restricting our analysis to two parameters, we can visualize the distribution by computing the energy, the negative logarithm of the posterior probability, over a grid of values of K d and E fr . The resulting energy surface, shown in Figure 2B, was minimal around the true values of the parameters. If we were to extend our analysis to more than two parameters, including uncertainty in the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif constants for instance, we would use a Markov chain Monte Carlo (MCMC) algorithm to sample from the parameter space. To show that the MCMC algorithm gives results consistent with the energy surface, we used the algorithm to sample the posterior probability distribution for K d and E fr . We ran the algorithm three times, starting each time at different initial values. All three walks converged on the same small region of equally probable values: they are superimposed on the energy surface in Figure 2B. Both methods therefore recover the true values used to generate the data as optimal parameter values. The approximate posterior probability distributions for K d and E fr are shown in Figure 2C. The distributions are estimated as the histograms of the three walks, excluding the burn-in period before convergence [28]. They peak near the parameter values used to simulate the data, indicating that the highly probable values for K d and E fr predicted by the methods are accurate.

      Inferring other quantities from FRET data

      To further illustrate the versatility of our method, we show that we can also use our model and Bayesian framework to estimate other parameters. Two instrument-independent parameters that have been a focus of interest are the apparent FRET efficiency, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq16_HTML.gif , and the ratio, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq17_HTML.gif [1519, 14]. The analogous apparent FRET efficiency for the acceptor, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq18_HTML.gif , is the product of E d and r da . We can estimate these quantities using our method, provided that a calibration has been carried out with cells expressing only donors, only acceptors, and only a donor-acceptor construct. The FRET efficiency of the construct need not be known. Supposing we have calibrated the parameters http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif of Eq 2 relative to http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq19_HTML.gif , which is taken to be unity, then we can express Eq 2 just in terms of [A 0] with [D 0] = r da [A 0] and E fr [DA] = E d r da [A 0]. We can analytically maximize the posterior probability of r da and E d as a function of [A 0] by setting the derivative of the probability with respect to [A 0] equal to zero and solving for this optimal [A 0] in terms of r da and E d . Our MCMC algorithm can be used to sample values of r da and E d that are consistent with the data.


      Algorithm reflects data quality

      An algorithm for estimating parameter values should report not just an estimate of the most probable parameter values but also the reliability of that estimate. To evaluate this aspect of our algorithm, we measured both the error and the uncertainty of its output. We defined the error as the discrepancy between the mean of the posterior probability distribution and the true value. We calculated error as http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq20_HTML.gif , meaning that a perfect estimate would have zero error. Because under experimental conditions we would not know the true value, we also calculated the uncertainty of the estimate as an indicator of the reliability even when true values are unknown. The uncertainty corresponds to the width of the posterior probability distributions for K d and E fr . To prevent the magnitude of the parameter of interest from skewing the uncertainty, we use the relative uncertainty, calculated as the coefficient of variation (the standard deviation divided by the mean).

      We first tested our algorithm in the presence of varying levels of measurement noise. As the examples in Figure 3A suggest, when measurement noise increased, the optimal regions in K d and E fr -space grew wider and longer and the peak of the posterior probability distribution for K d often deviated from the true value (see Figure 3A inset). In Figures 3C and 3E, the error (upper plot) and uncertainty (lower plot) for estimating K d are shown and confirm that both the error and uncertainty of the estimate increased with increasing noise. Similar results were obtained for the estimate of E fr (data not shown).
      Figure 3

      K d estimates reflect data quality and quantity. We simulated and analyzed data with varying levels of measurement noise and numbers of measurements/cell/channel. The locations visited by MCMC walks (dots) for the three noise levels (A) and the three numbers of measurements/cell/channel (B) show that the highly probable region grows as measurement noise increases and as the number of measurements decreases. The 'True Value' (black and white spot) indicates values used to generate the data. Histograms of the locations visited by the walks (insets, histograms smoothed for clarity) approximate the corresponding posterior probability distributions for K d , with true values indicated by black lines. The plots of error http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq21_HTML.gif vs. measurement noise (C) and error vs. amount of data (D) illustrate that, in general, accuracy decreases with increasing noise and decreasing number of measurements although the mean of the error remains centered at zero. Even when the true value is unknown, the relative uncertainty of the parameter estimate (coefficient of variation of locations visited by a walk) is measurable; it also grows with increasing noise (E) and decreasing number of measurements/cell/channel (F). In (C-F), error bars are mean ± SD. 50 data sets were analyzed for each noise level or number of measurements, and each dataset was analyzed once with a random walk running for 20,000 steps, starting once the walk converged. Except when otherwise indicated, data had 10 measurements/cell/channel and 5% added Gaussian noise. For other parameters, see Methods.

      Next, we verified that gathering more data would improve the quality of the estimates of K d and E fr . Such data could be obtained by, for example, collecting several images of the same sample or dividing a region of interest into subregions. The examples in Figure 3B show that with more measurements, the optimal region became smaller and the peak of the posterior probability distribution for K d moved closer to the true value (see Figure 3B inset). Figures 3D and 3F show, as expected, that increasing the number of measurements decreases the error and uncertainty for the K d estimate. The effect of data quantity on the error and uncertainty of the E fr estimate was similar (data not shown).

      Overall, our algorithm reliably reflects the quality of the data analyzed in the uncertainty it gives of its estimate and accurately infers the values of K d and E fr for 10 measurements per cell per channel with 10% measurement noise. We also note that in Figures 3C and 3D, the mean error in K d remained close to zero, indicating that neither measurement noise nor number of measurements systematically biased the location of the peak.

      The total amount of donor, [D 0 ], and the total amount of acceptor, [ A 0 ], affect inference of K d and E fr

      Inferring unique values of K d and E fr requires variations in [D0] and [A0]

      A challenge in analyzing FRET data is that both E fr and K d affect the extent of FRET that may occur, making it difficult to tease out the contribution of each parameter. In general, for data from a single pair of donor and acceptor concentrations, a weak E fr and strong K d can give rise to the same observations as a strong E fr and weak K d . However, because our algorithm can analyze data from multiple cells at once, it can overcome this challenge.

      To demonstrate the problem, we analyzed a set of simulated data from a single sample with [A 0] = [D 0] = 1 μM. As Figure 4A shows, the resulting region of optimal values had an elongated shape, indicating that many equally probable solutions exist for {K d , E fr } for that dataset. To show that collecting more data from similar cells does not help, we repeated the experiment, simulating data from 3 cells with the same concentrations as the first. As Figure 4B shows, including this data made the region slightly narrower but did not change its elongated shape.
      Figure 4

      Concentration variations determine shape of region of high probability. Using the Monte Carlo algorithm to analyze data from a single cell containing [D0] = [A0] = 1 μM produced an elongated region of highly probable values (A), indicated by the collection of dots showing the locations in (K d , E fr ) visited by the biased random walk. These single-cell results are shown again in (B-E) (blue dots), with the results from analyzing other data sets superimposed. Analyzing three identical cells instead of one produced a narrower but still elongated region (B). However, when analyzing data from two cells (C and C inset) and three cells (D) with different concentrations of fluorophores, the resulting regions were contracted. Analyzing cells individually showed that the elongated highly probable regions for each all intersected near the true value (E). More extreme variation in concentrations led to an even smaller optimal region (F). In (A-F), there were 10 measurements/cell/channel, 3% added Gaussian noise, and 31,000 steps are shown for each walk. For other parameter values, see Methods.

      The difficulty in determining K d without knowing E fr or vice versa can be overcome by analyzing data from samples containing varying fluorophore concentrations [16, 21]. To quantitatively justify how varying concentrations facilitates parameter fitting, we simulated data from three cells containing different concentrations of donors and acceptors. As Figure 4C shows, analyzing data from two different cells decreased the size of the region of equally probable values. Combining data from a cell containing [D 0] = [A 0] = 1 μM with data from a cell containing 0.5 μM eliminated high values of K d (Figure 4C); combining data from a 1 μM cell and a 5 μM cell eliminated low values of K d (Figure 4C, inset). As Figure 4D indicates, analyzing data from all three cells (0.5 μM, 1 μM, and 5 μM) together eliminated higher and lower K d values, leaving a small region of equally probable values that included the true values. To show why varying fluorophore concentrations is effective, we analyzed data from each of the three cells in Figure 4D individually. The optimal regions for each cell are plotted in Figure 4E and clearly intersect at the true value. Varying the concentrations more broadly can further shrink the solution region (Figure 4F).

      Inference is limited if [D0] = [A0] ≫ K d or [D0] = [A0] ≪ K d
      Plots of complex formation vs. K d (Figure 5A, insets) show that only K d values within a limited range significantly affect complex formation: outside this range, further changes in K d have little effect. For instance, when [D 0] = [A 0] ≈ K d , small changes in K d in either direction affect [DA] (Figure 5A centre, inset). However, when [D 0] = [A 0] ≪ K d (Figure 5A, left inset) or [D 0] = [A 0] ≫ K d (Figure 5A, right inset), only changes to K d in one direction affect the amount of [DA]. When a range of K d values are consistent with the data, K d cannot be fit to a single value.
      Figure 5

      Gaining insight into optimal experimental design. The approximate posterior probability distributions for K d (A) have different shapes if the data analyzed was simulated from three cells containing equal concentrations of donors and acceptors which are much lower than (left), higher than (right) or about equal to K d (centre). For the data analyzed in the left panel, for instance, the cells contained the concentrations [D0] = [A0] = 0.2·10-3μM, [D0] = [A0] = 1·10-3μM, and [D0] = [A0] = 5·10-3μM. Insets show amount of complex formed as a function of K d for the indicated concentrations, demonstrating that where complex formation is insensitive to K d corresponds to plateaus in the posterior probability distributions for K d . In each plot (or inset), a vertical dashed line (or red circle) indicates 10-6M, the true value of K d . (A) had 36,000 steps/walk and 5% added noise. Exploring another aspect of fluorophore concentrations, increasing the ratio [D0] : [A0] increases the uncertainty in fitting K d (B). As the ratio was increased (by keeping [D0] constant for the three cells at 0.2·10-6M, 1.0·10-6M and 5.0 10-6M while decreasing [A0] according to the ratio), posterior probability distributions for K d broadened (true value indicated by dashed vertical line). Insets show data used for fitting (bars marked 'E fr = 0.4') from the donor channel (left) and FRET channel (right) contrasted with data from the same cells simulated with E fr = 0, demonstrating that the relative contribution of FRET decreases as [D0] : [A0] increases. (B) had 50 measurements/cell/channel, 36,000 steps/walk and 3% added noise. Bars show mean ± SD. For other parameters, see Methods.

      By finding the posterior probability distribution of K d , our algorithm predicts when a range of K d values will be consistent with the data. First, we simulated data with equal concentrations of acceptors and donors that were very low or very high relative to K d . As shown in Figure 5A, the approximate posterior probability distributions of K d for data simulated with relatively low or high concentrations form plateaus instead of single peaks. The plateaus indicate that a range of K d values is equally probable and consistent with the data. The posterior distribution obtained for [D 0] = [A 0] ≪ K d reveals that K d could be any number greater than approximately 10-6 M; the distribution for [D 0] = [A 0] ≫ K d reveals that K d < 10-6 M.

      This analysis illustrates how posterior probability distributions for the parameters of interest can be more informative than single values. For example, information from posterior distributions can be useful for improving experimental design. If one were to obtain plateaued distributions like the two shown in Figure 5A, one could recognize that the fluorophore concentrations were too low or high relative to K d and devise a more informative experiment.

      The ratio [D 0]:[A 0] affects inference

      We also used our algorithm to explore the effects of varying the ratio, [D 0] : [A 0], on the algorithm's ability to infer K d (Figure 5B) and how these effects arise. While other authors have reported that FRET data becomes increasingly unreliable as the ratio [D 0] : [A 0] deviates from unity [13], they measured how noise propagates in FRET formulas and did not directly address how the ratio impacts the measurement of interaction strength. As Figure 5B shows, the width of the approximate posterior distributions for K d broaden as the ratio increases, meaning that the uncertainty in the estimate increases. The peaks of the distributions however remain close to the true values. The same effect occurs when the ratio decreases (data not shown).

      The insets in Figure 5 illustrate why the ratio affects the uncertainty, showing data from the donor and FRET channels in the presence and absence of FRET. Here, FRET contributes only a fraction of the observed signals. This small contribution is particularly evident in the FRET channel, where the change in the signal due to FRET is small compared to the measurement error: measurement error is 3% and the increase in the FRET channel from FRET is only 2.9% for [D 0] : [A 0] = 1 and drops to 0.7% for [D 0] : [A 0] = 10 (FRET is included by changing the FRET efficiency from 0 to 0.4). As the ratio increased, the relative shortage of acceptors meant that fewer complexes could form, making the contributions from FRET smaller and causing the uncertainty of the estimate to grow. In the absence of measurement noise, however, our algorithm can estimate K d even from very extreme ratios. These data also demonstrate that with our method, we can make inferences about K d even when FRET causes perceptible changes in only one channel (e.g. the donor channel here).

      Knowing how spectral overlap and the ratio of donors to acceptors affect the inference of protein interaction strength helps make it possible to design informative experiments. For example, we find that in each channel, the difference generated by FRET depends on spectral overlap. In the donor channel, where FRET decreases the signal, the relative change is http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq22_HTML.gif , meaning that the observed change depends on the difference between acceptor bleed-through, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq23_HTML.gif , and donor fluorescence, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq24_HTML.gif . Analogous changes occur in the other channels. When [D 0] = [A 0], the change in the donor channel is proportional to http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq25_HTML.gif , the change in the acceptor channel to http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq26_HTML.gif and the change in the FRET channel to http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq27_HTML.gif . In each channel, increasing the difference between http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq28_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq29_HTML.gif would make experiments more informative.

      Prior information improves estimate

      A further advantage of our method is its flexibility. The Bayesian framework makes it possible to incorporate additional details we know about the system as prior information, allowing us to more accurately represent the system being analyzed and potentially improve our estimate of the parameters of interest. To exploit this feature, we tested whether including additional information about the FRET efficiency would improve our inference of E fr and K d . Information about E fr could be obtained, for example, from lifetime imaging experiments [29].

      Figure 6 shows the approximate posterior probability distributions for E fr and K d that result from analyzing the same data in the presence and absence of prior information for E fr . We included slightly inaccurate prior information with large uncertainty, supposing that E fr was measured to be 0.45 (instead of the true value, E fr = 0.4) with an uncertainty of 0.15. Even with this limited information, the resulting histograms for both K d and E fr have taller, narrower peaks which are closer to the true values compared to the histograms produced by analyzing the same data in the absence of the prior information. As expected, this improvement was more substantial when we analyzed a set of data consisting of just 3 measurements per cell per channel (Figure 6, right column): prior information becomes more important the less relevant information there is in the data. As another form of prior information, we could also include error in the calibration measurements for the constants that relate molecular concentration to fluorescence in Eq (2). Incorporating this information would make it possible to monitor the impact of that error on the uncertainty of the final estimate of K d .
      Figure 6

      Prior information on E fr improves estimate. Approximate posterior distributions for E fr (upper panels) and K d (lower panels) obtained from using the MCMC algorithm to analyze the same datasets in the absence (red) or presence (blue) of prior knowledge about E fr show that prior knowledge helps to improve accuracy and reduce uncertainty of the estimate. True values are indicated by vertical dashed lines and the inset shows the prior distributions used (red is a uniform distribution and blue is a normal distribution centred at 0.45 with a variance of 0.15). The plots on the left show results from analyzing data where 10 measurements were made in each channel for each of 3 cells with [A0] = [D0] = [.2, 1, 5]·10-6M. The plots on the right show results from data from the same cells, but with just 3 measurements/channel. 20,000 steps were recorded for each biased biased random walk, with 5% added noise. For other parameter values, see Methods.

      Summary of testing

      We have shown, using typical, simulated FRET data, that our algorithm accurately recovers the parameters of interest. It responded consistently and intuitively to changes in the amount of measurement noise present in the data and the quantity of data. We also used the posterior probability distributions we obtain for the parameters to gain insight into how the magnitude and variation of the donor and acceptor concentrations affect the ability of the algorithm to infer K d . The algorithm gave informative results even when given data simulated with very high or low concentrations of donors and acceptors relative to the K d although, as expected, with this data it could not converge on a unique pair of values for K d and E fr but gave upper or lower bounds. Gaining an understanding of how experimental parameters such as measurement noise, number of measurements, spectral overlap and fluorophore concentrations affect inference allows experimenters to design optimal experiments that yield information with less uncertainty [24].


      Fluorescence microscopy and FRET open a window onto the cell, allowing us to observe protein interactions as the cell functions as a complete system. However, for protein interaction information from FRET data to be integrated into models and improve our understanding of biological systems, it must be reliably quantified, including the uncertainty in the estimates produced [27].

      For this purpose, we have presented an algorithm for inferring the most probable values of the absolute or relative in vivo dissociation constant and the FRET efficiency from three-cube FRET data. Our algorithm produces estimates in terms of the posterior probability distribution for the parameters of interest. Posterior probability distributions yield more extensive information than a single value can provide, conveying the reliability of the estimates through the shape and width of the distribution and giving upper or lower bounds on parameters if the data is not more informative. Our method requires only basic three-cube FRET data but has the flexibility to use data from more spectral channels and incorporate other types of data, such as a separate measure of the FRET efficiency. We have focused on using our algorithm to fit E fr and K d , but it can also infer the apparent FRET efficiency and ratio of donors to acceptors, other quantities sometimes used as measures of FRET [1519, 14].

      In the examples described here, we make a few assumptions, but these are not necessarily part of our methodology. First, we use molar extinction coefficients that must be measured separately or taken from the literature and assume that the literature measurements are valid in the cellular environment. Second, we assume Gaussian measurement noise, but our model could be straightforwardly adapted to include log-normal or other types of noise. Finally, we have not taken into consideration photo-bleaching, incomplete labelling or dark states, but our model could readily be extended to include these factors.

      We have used simulated data to illustrate principles that apply to real data, showing how, in practice, it is best to infer the dissociation constant from FRET data. Three-cube FRET experiments are a well-established experimental technique [36]. For our method, the calibration procedure we describe in the Methods for quantifying the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif constants of Eq (2) is also necessary. The procedure is similar to existing calibration methods. Measuring relative values of http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq29_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq30_HTML.gif is a standard part of FRET data collection, necessary for calibrating the extent of spectral bleed-through and cross-talk. We require calibration with a donor-acceptor construct to relate the brightness of the donor to that of the acceptor, which is necessary for obtaining the relative K d . This calibration is not always a part of standard FRET data collection, but it has been used in several studies [8, 18, 17, 4, 14, 19, 21]. Alternatively, measuring the absolute brightness per molecule, which is necessary for measuring the absolute K d , is more technically challenging but has been achieved [30, 31].

      We have demonstrated that to infer the values of K d and E fr , it is important that the FRET data analyzed come from samples containing varying concentrations of donor- and acceptor-tagged proteins. The precise amount of variation required is difficult to predict because it depends on the measurement noise present in the data, the number of measurements made, and the extent of spectral bleed-through and cross-talk. Naturally occurring variation in protein expression from cell to cell generated sufficient variability for estimating the FRET efficiency and relative dissociation constant [21], at least in transiently transfected mammalian cells.

      While a number of ad hoc methods exist for quantifying protein interactions using FRET, our method contributes something new in that it makes plain the spectral information being used and the bio-chemical assumptions made about the system. Our model is general: it is not specific to three-cube data and could be used to analyze data with any number of spectral channels. It can be straight-forwardly adapted to other experimental situations, for instance to measure the dimerization affinity of a homodimeric protein. Our approach is a Bayesian analysis that reveals the uncertainty in estimated parameters and produces informative results for data from a wide-range of experimental set-ups. It focuses on determining what we can learn from experimental observations about the parameters of interest given clear assumptions and a systematic, statistical analysis based only on those assumptions.


      Calibration of fluorescence constants, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif

      The http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif constants, which relate concentrations of single fluorophores to observed fluorescence intensity, can be obtained by calibration. Calibration of spectral cross-talk and bleed-through is a standard part of FRET quantification [46]. If the fluorophore concentrations can be determined [30, 31], then the values for the three http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq29_HTML.gif and three http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq30_HTML.gif constants can be obtained from samples containing only donors or only acceptors. This calibration makes it possible to measure the absolute K d . Three-cube data from donor-only and acceptor-only samples correspond to the following equations (from Eq (2)):

      To infer the values of the constants http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq29_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq30_HTML.gif and their uncertainty (due to measurement error), we can sample from posterior distributions for http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq29_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq30_HTML.gif given the calibration data. We will illustrate this procedure for a more complicated example below.

      Determining the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq28_HTML.gif , which relate a complex undergoing FRET to its fluorescence in channel i, is more complicated because the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq28_HTML.gif depend on properties of both the donor and the acceptor. However, from Eq (3), http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq31_HTML.gif , but http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq32_HTML.gif . Therefore, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq33_HTML.gif , and so http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq28_HTML.gif can be obtained through knowledge of the ratio of molar extinction coefficients at the excitation wavelength for channel i. The values of the molar extinction coefficients, ε (D) and ε (A), may not always be available at these wavelengths, but they can be estimated from literature values of the molar extinction coefficients (usually measured at the fluorophore's excitation peak) and the excitation spectra of the donor and acceptor [25]. Interpreting the excitation spectra as the probability of the fluorophore becoming excited and assuming that the extinction coefficient is proportional to the probability of excitation, we can use the excitation spectra to rescale the literature value of the extinction coefficient to the excitation wavelength of channel i. This estimate is valid provided that the molar extinction coefficients obtained from the literature are not significantly different from the molar extinction coefficients in the cellular environment of the experiment (or provided that the change in environment affects both donor and acceptor similarly).

      When the relationship between brightness and concentration cannot be determined absolutely, relative values for http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq29_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq30_HTML.gif can be obtained from samples containing only donors and only acceptors, along with samples containing linked donor-acceptor constructs. The constructs consist of a donor and an acceptor fluorophore separated by a short linker of 5-10 amino acids and have been used for FRET calibration in several studies [8, 18, 17, 4, 14, 19, 21]. The construct's FRET efficiency, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq34_HTML.gif , need not be known and can be determined using the procedure we describe below.

      The three-cube measurements on samples containing only donors or only acceptors would correspond to Eqs (8) and (9). Three-cube data obtained from samples containing donor-acceptor constructs would correspond to the following equations:
      where we have replaced http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq28_HTML.gif by http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq35_HTML.gif . To infer the values of the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif constants given calibration data consisting of three-cube measurements made on samples with only donors, only acceptors, and donor-acceptor complexes, we first define the general likelihood function of these experiments for a single sample:

      where http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq36_HTML.gif , http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq37_HTML.gif , and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq38_HTML.gif are the i th measurements in the donor, acceptor, and FRET channels, respectively. For a given sample, we assume that an equal number of measurements, n, will be made in each channel and σ quantifies the error of the measurements. The predicted intensities in channel k, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq39_HTML.gif , are given by Eqs (8), (9) and (10) and are a function of the constants http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif , the concentrations of the species (donors, acceptors, or donor-acceptor complexes), and also http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq34_HTML.gif in the case of the donor-acceptor construct. The full likelihood for all the calibration data together would be the product of three instances of Eq (11), one specified for each three-cube experiment (donor-only, acceptor-only, or donor-acceptor complex).

      We are interested in the values of the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif but indifferent to the concentrations of fluorophores and complexes in the samples. For this reason and because the concentrations of fluorophores and complexes will vary for different samples, it is useful to marginalize the calibration likelihood over [A 0], [D 0], and [DA]. We can also eliminate σ by marginalization [28], assuming the measurement error is the same for all three spectral channels. If this assumption does not hold, one can define σ k for the measurement error in each channel and either approximate each σ k as equal to the standard deviation of the measurements in the k th channel or include the σ k as parameters to be inferred in the procedure we describe below. Assuming a prior probability that only specifies positive values for the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq34_HTML.gif , marginalizing Eq (11) over the concentrations and σ yields the posterior probability of http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq34_HTML.gif and relative values of http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif :

      which is valid for the three cube experiments for each sample (donor, acceptor, or construct). For the donor-only sample, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq40_HTML.gif ; for the acceptor-only sample, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq41_HTML.gif ; and for the sample with the donor-acceptor construct, the a i are replaced with http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq42_HTML.gif . The posterior probability, including all the calibration data together, is obtained by multiplying the forms of Eq (12) for each of the three 3-cube experiments.

      For a given data set, we can infer http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq34_HTML.gif and five of the six http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif constants relative to the remaining one, for instance http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq43_HTML.gif , which is set to unity. The numerator of Eq (12) remains unaltered by such a rescaling of the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif , and the alteration in the denominator is cancelled by the Jacobian required for the change in variables. One can use a Markov chain Monte Carlo method to sample the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq34_HTML.gif from P calib. Alternatively, one can use a numerical solver to find the most probable values by solving for the roots of the system of equations consisting of the derivatives of Eq (12) with respect to each of the six variables. Using this relative calibration procedure, the final K d values obtained would relate to the true K d by the scaling factor, for instance http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq43_HTML.gif .

      Data simulation

      We designed our simulated data to mimic the key features of experimental data, which could come from various sources, such as fluorescence reader measurements of solutions of purified proteins or images of cells from fluorescence microscopy that have been processed and quantified. To simulate data, we wrote a function in Matlab (The Mathworks, Natick, MA) that takes as input E fr , K d , A 0 , D 0 , n, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif , and r where A 0 , D 0 are vectors of length m (representing m cells or regions within cells), n is the number of measurements made on each cell or vesicle, and r is the strength of measurement noise. The function outputs a set of simulated three-cube FRET data from m samples with n measurements per sample.

      For each pair of concentrations, [A 0] and [D 0], we calculate [D], [A], and [DA] using Eq (5). Next, we calculate the simulated experimental intensities I D , I A , and I F using Eq (1). Finally, we simulate measurement noise by adding Gaussian random numbers to the data. The variable r is a scalar between 0 and 1 that refers to the strength of the measurement noise relative to the mean of the observed signal. Noise with r = 10% would have Gaussian measurement noise with standard deviation that is 10% of the mean signal observed in each channel.

      In our examples, we use E fr = 0.4 and K d = 1 μM. Unless otherwise indicated, we simulate ten measurements per cell per channel and use the following set of http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif values, which were chosen to represent a donor-acceptor pair with considerable spectral cross-talk and bleed-through:

      The constants corresponding to the donor, acceptor, and complex fluorescing in their respective channels http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq44_HTML.gif are assigned the largest values. As donors and acceptors are often not equally bright, we set http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq45_HTML.gif . In the FRET channel, the constants corresponding to bleed-through from donors http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq46_HTML.gif and cross-talk from acceptors http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq47_HTML.gif have been set to approximately 20% of the constants for donor and acceptor in their respective channels http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq48_HTML.gif because spectral overlap contributes significantly in the FRET channel. In the acceptor channel, smaller values are assigned to http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq49_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq50_HTML.gif to describe the donor undergoing cross-talk and, subsequently, either bleed-through ( http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq49_HTML.gif ) or FRET ( http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq50_HTML.gif ) in the acceptor channel. It is unlikely that acceptors would emit photons detectable in the donor channel, so the constants corresponding to that process, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq51_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq52_HTML.gif , are very small, but they are non-zero to show that any measurable spectral contamination can be included.

      Incorporating prior information

      The Bayesian framework makes it possible to incorporate prior information about any of the parameters, including uncertainty in http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif . As an example, we include prior knowledge about E fr (see the Results). Such data could be collected in separate experiments under slightly different experimental conditions and in the presence of measurement noise. We define the prior probability of E fr by a Gaussian distribution with mean at the measured value http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq53_HTML.gif and variance σ E , which reflects the confidence in the measurement:

      Markov chain Monte Carlo (MCMC) estimation

      To sample K d and E fr from the posterior probability distribution, we use a biased random walk. Although a simpler approach would be sufficient to explore two dimensions, we use this method because it will allow us to efficiently extend our search to sample other, additional parameters we may wish to infer, such as the values of http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif .

      We use the Metropolis-Hastings algorithm [32, 33]. It begins at a random location in parameter space and takes random steps, moving in up to 3 dimensions at a time with proposed steps drawn from a distribution that is symmetric about the current location. Proposals that increase the posterior probability are always accepted; those that decrease it are accepted with probability http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq54_HTML.gif , where P (x j ) is the posterior probability of the proposed step and P (x j-1) is the posterior probability of the current location. The walk is run for a sufficiently long time (about 10,000 steps) to generate independent samples from the posterior distribution and the step size chosen to maintain an acceptance rate of 40 - 60% [28].

      Once the walk has converged with the energy fluctuating around a minimum value, we record the steps taken and use the histogram of these sampled values as our estimate of the posterior probability distribution. From this estimated distribution, we obtain the mean and standard deviation of each parameter being inferred. The algorithm is summarized below.

      Marginalization of D 0 and A 0

      We have defined the likelihood in Eq 7 and assume that the measurement noise in each channel is independent. Because we have a Gaussian model, the values of the measurement noise parameters, σ D , σ A and σ F , that maximise the posterior probability are approximately equivalent to the observed variances of the data from the donor, acceptor, and FRET channels, respectively [28]. However, it would also be possible to fit http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq55_HTML.gif , http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq56_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq57_HTML.gif directly, along with K d and E fr .

      Although K d , E fr , D 0 and A 0 are important parameters, we cannot measure them directly. For our purposes, we are interested in the values of K d and perhaps E fr , but not the values of D 0 and A 0. Rather than fit D 0 and A 0, we marginalize or integrate them out:

      To indicate that we have no knowledge about the values of D 0 and A 0, we set the prior, P (D 0, A 0), to a constant for positive D 0 and A 0 (and 0 otherwise).

      As a result,
      Because this expression is difficult to integrate analytically, we consider the 'energy', E = - log(P(data|K d , E fr , D 0, A0)), and approximate E with a two-dimensional Taylor expansion about http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq58_HTML.gif , where http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq58_HTML.gif minimizes E (and thus maximizes the likelihood). Ignoring terms above second order,

      where ∇∇E is the Hessian, or matrix of second-derivatives of E with respect to D 0 and A 0.


      This approximation of the likelihood results in a Gaussian integrand, which we can then integrate analytically [34]. Note that we do not use the limits of integration (0, ∞) which would be appropriate for non-negative concentration values. We instead use the limits (- ∞, ∞) to make the integral simple. It is a valid approximation provided that http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq59_HTML.gif and http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq60_HTML.gif are sufficiently large and it consistently yields appropriate results in practice.

      In summary,

      where |H| is the determinant of the Hessian, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq61_HTML.gif . Although K d and E fr do not appear explicitly in the final form of Eq (18), both http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq62_HTML.gif and H depend on K d and E fr .

      Algorithm summary

      Our algorithm for sampling from the posterior probability distribution of (K d , E fr ) with a prior probability distribution for E fr consists of the following steps:

      1. Perform calibration to obtain values (absolute or relative) for the http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq4_HTML.gif constants.

      2. Define prior probability distributions for parameters to be inferred based on initial information that is known, if any.

      3. Run Markov chain Monte Carlo algorithm:

      Choose initial step http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq63_HTML.gif and compute the posterior probability, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq64_HTML.gif , as outlined in (b) below.

      For j = 2 to n,

      (a) Choose proposal step http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq65_HTML.gif .

      (b) Compute the posterior probability, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq66_HTML.gif , for that step:

      i. Find http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq58_HTML.gif , the values of {D 0, A 0} that minimise the energy of the posterior probability using nonlinear optimisation (the Nelder-Mead simplex algorithm implemented in Matlab's fminsearch function (The Mathworks, Natick, MA)).

      ii. Compute http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq61_HTML.gif using analytical expressions for the second derivatives of E computed in Mathematica (Wolfram Research, Champaign, IL) and exported to Matlab.

      iii. Using Eq (18), compute the likelihood, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq67_HTML.gif .

      iv. Using Eq (13), compute the prior, http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq68_HTML.gif .

      v. Compute the posterior probability distribution, P j , which is the likelihood times the prior probability.

      (c) Check whether to accept the move to http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq69_HTML.gif

      • If P j > P j-1, accept.

      • Otherwise, accept with probability http://static-content.springer.com/image/art%3A10.1186%2F2046-1682-4-10/MediaObjects/13628_2011_12_IEq70_HTML.gif .

      Steps (a)-(c) are repeated until j = n. After an initial burn-in period where the energy reaches a minimum and the walk achieves a stationary distribution, record samples of K d and E fr .

      4. Repeat step 3, varying initial (K d , E fr ). To ensure convergence, the stationary distributions of all walks should overlap [28].


      We have made our data simulation and analysis software available at http://​swainlab.​bio.​ed.​ac.​uk/​software/​FRET.



      We thank Marko Laine (Finnish Meteorological Institute) for providing his MCMC code and helpful feedback. We also wish to acknowledge Stephane Laporte, May Simaan and Jay Nadeau (McGill University) for useful discussions and Andrea Weisse, Bruno Martins, Christos Josephides, Clive Bowsher (University of Edinburgh), Nathan Scales (McGill University), and Vahid Shahrezaei (Imperial College London) for helpful comments on the manuscript. CL and PSS are supported by the Scottish Universities Life Sciences Alliance.

      Authors’ Affiliations

      Department of Physiology, McGill University
      Centre for Systems Biology at Edinburgh, University of Edinburgh


      1. Verveer P, Bastiaens P: Quantitative microscopy and systems biology: seeing the whole picture. Histochemistry and cell biology 2008,130(5):833–843.View Article
      2. Megason S, Fraser S: Imaging in systems biology. Cell 2007,130(5):784–795.View Article
      3. Lakowicz J: Principles of Fluorescence Spectroscopy. New York, NY: Springer; 2006.View Article
      4. Vogel SS, Thaler C, Koushik SV: Fanciful FRET. Sci STKE 2006, 2006:re2.View Article
      5. Piston DW, Kremers GJ: Fluorescent protein FRET: the good, the bad and the ugly. Trends Biochem Sci 2007, 32:407–14.View Article
      6. Hoppe A: Quantitative FRET microscopy of live cells. In Imaging Cellular and Molecular Biological Functions. Edited by: Shorte SL, Frischknecht F. Heidelberg: Springer; 2007:157–180.View Article
      7. Gordon G, Berry G, Liang X, Levine B: Quantitative fluorescence resonance energy transfer measurements using fluorescence microscopy. Biophysical Journal 1998, 74:2702–2713.ADSView Article
      8. Xia Z, Zhou Q, Lin J, Liu Y: Stable SNARE complex prior to evoked synaptic vesicle fusion revealed by fluorescence resonance energy transfer. J Biol Chem 2001, 276:1766–71.
      9. Nagy P, Vámosi G, Bodnár A, Lockett SJ, Szöllösi J: Intensity-based energy transfer measurements in digital imaging microscopy. Eur Biophys J 1998, 27:377–89.View Article
      10. Elangovan M, Wallrabe H, Chen Y, Day RN, Barroso M, Periasamy A: Characterization of one- and two-photon excitation fluorescence resonance energy transfer microscopy. Methods 2003, 29:58–73.View Article
      11. Trón L, Szöllósi J, Damjanovich S, Helliwell SH, Arndt-Jovin DJ, Jovin TM: Flow cytometric measurement of fluorescence resonance energy transfer on cell surfaces. Quantitative evaluation of the transfer efficiency on a cell-by-cell basis. Biophysical Journal 1984, 45:939–46.ADSView Article
      12. Youvan D, Silva C, Bylina E, Coleman W, Dilworth M, Yang M: Calibration of fluorescence resonance energy transfer in microscopy using genetically engineered GFP derivatives on nickel chelating beads. Biotechnology et alia 1997, 3:1–18.View Article
      13. Berney C, Danuser G: FRET or no FRET: a quantitative comparison. Biophysical Journal 2003, 84:3992–4010.ADSView Article
      14. Zal T, Gascoigne NRJ: Photobleaching-corrected FRET efficiency imaging of live cells. Biophysical Journal 2004, 86:3923–39.ADSView Article
      15. Hoppe A, Christensen K, Swanson JA: Fluorescence resonance energy transfer-based stoichiometry in living cells. Biophysical Journal 2002,83(6):3652–64.ADSView Article
      16. Erickson MG, Alseikhan BA, Peterson BZ, Yue DT: Preassociation of calmodulin with voltage-gated Ca(2+) channels revealed by FRET in single living cells. Neuron 2001,31(6):973–85.View Article
      17. Chen H, Puhl H, Koushik S, Vogel S, Ikeda S: Measurement of FRET efficiency and ratio of donor to acceptor concentration in living cells. Biophysical Journal 2006, 91:L39–41.View Article
      18. Elder A, Domin A, Schierle GK, Lindon C, Pines J, Esposito A, Kaminski C: A quantitative protocol for dynamic measurements of protein interactions by FRET-sensitized fluorescence emission. Royal Society Interface 2009, 6:1–24.View Article
      19. Wlodarczyk J, Woehler A, Kobe F, Ponimaskin E, Zeug A, Neher E: Analysis of FRET signals in the presence of free donors and acceptors. Biophysical Journal 2008, 94:986–1000.ADSView Article
      20. Gadella T: FRET and FLIM Techniques. Elsevier Science; 2009.
      21. Chen H, III HP, Ikeda S: Estimating protein-protein interaction affinity in living cells using quantitative Förster resonance energy transfer measurements. Journal of Biomedical Optics 2007, 12:054011.ADSView Article
      22. Bruggeman F, Westerhoff H: The nature of systems biology. TRENDS in Microbiology 2007, 15:45–50.View Article
      23. Piehler J: New methodologies for measuring protein interactions in vivo and in vitro. Curr Opin Struct Biol 2005, 15:4–14.View Article
      24. Kreutz C, Timmer J: Systems biology: experimental design. FEBS Journal 2009,276(4):923–942.View Article
      25. Neher RA, Neher E: Applying spectral fingerprinting to the analysis of FRET images. Microsc Res Tech 2004, 64:185–95.View Article
      26. Ljosa V, Carpenter AE: Introduction to the Quantitative Analysis of Two-Dimensional Fluorescence Microscopy Images for Cell-Based Screening. PLoS Comput Biol 2009,5(12):e1000603.View Article
      27. Wilkinson D: Stochastic modelling for quantitative description of heterogeneous biological systems. Nat Rev Genet 2009,10(2):122–133.View ArticleMathSciNet
      28. MacKay D: Information Theory, Inference & Learning Algorithms. Cambridge: Cambridge University Press; 2002.
      29. Wallrabe H, Periasamy A: Imaging protein molecules using FRET and FLIM microscopy. Current Opinion in Biotechnology 2005, 16:19–27.View Article
      30. Rosenfeld N, Perkins TJ, Alon U, Elowitz MB, Swain PS: A fluctuation method to quantify in vivo fluorescence data. Biophysical Journal 2006,91(2):759–66.ADSView Article
      31. Digman MA, Dalal RB, Horwitz AF, Gratton E: Mapping the number of molecules and brightness in the laser scanning microscope. Biophysical Journal 2008,94(6):2320–2332.ADSView Article
      32. Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E: Equation of state calculations by fast computing machines. J Chem Phys 1953, 21:1087–1092.ADSView Article
      33. Hastings W: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57:97–109.MATHView Article
      34. Sivia D, Skilling J: Data Analysis: a Bayesian Tutorial. Oxford: Oxford University Press; 2006.MATH


      © Lichten and Swain; licensee BioMed Central Ltd. 2011

      This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://​creativecommons.​org/​licenses/​by/​2.​0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.