
4. RESULTS AND DISCUSSION

4.5 Assessing the reproducibility of the improved BBM preparation

4.5.3 Estimation of experimental reproducibility

4.5.3.2 Estimation of experimental reproducibility based on LC-MS signals

4.5.3.2.1 Comparative analysis of a standard peptide mixture

In a first approach, a simple standard peptide mixture of six digested proteins (Cytochrome C, Lysozyme, Alcohol dehydrogenase, Bovine serum albumin, Serotransferrin and β-Galactosidase; the so-called LCP Dionex peptide mixture) was used to tune the data processing scheme and to become acquainted with the behavior of the data under ideal conditions, in order to understand how to judge (dis)similarity in this context. The simplicity of this sample makes it considerably easier to monitor dilution effects and column or buffer variability. In a second step, the key lessons from this idealized study were applied to a more complicated dataset, such as bands 2, 9 and 11 discussed in the preceding section.

The raw data of each LC-MS run were processed according to the parameters described in § 3.2.7.3. Filtered signals that were present in 2/3 of the runs analyzed were accepted as common signals. Each of those MS signals was associated with a run-specific retention time (RT) and accurate mass to allow the pairwise comparison of the processed MS signals between two samples in the form of scatter plots. The linearity and data distribution along the diagonal (that is, the intensity deviation of the common signals from an ideal linear relationship) was expressed using the Spearman correlation. Spearman's rank correlation coefficient is a non-parametric measure of correlation: it assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any assumptions about the frequency distribution of the variables. In the case of perfect reproducibility, we expect all the data points to lie on the diagonal, which gives a correlation of 1.0.
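As an illustration of the rank-based measure described above, the following minimal sketch computes a Spearman coefficient for the intensities of common signals in two runs. It is not the processing pipeline used in this work: the function names and the intensity values are hypothetical, and the coefficient is obtained in the textbook way, as the Pearson correlation of the ranks, with tied values receiving their average rank.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical intensities (counts) of five common signals in two replicates.
run_a = [1200.0, 530.0, 88000.0, 4100.0, 960.0]
run_b = [1100.0, 610.0, 91000.0, 3900.0, 1020.0]
rho = spearman(run_a, run_b)  # identical rank order in both runs
```

Because only the ranks enter the calculation, the value is the same whether raw counts or log2-transformed intensities are used.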

Fig. 4.26 shows representative scatter plots for the pairwise comparison of three injection replicates of a 50 fmol standard peptide mixture.

Figure 4.26: Scatter plot representations of three injection replicates of a 50 fmol Standard Peptide Mixture. Panel A: Scatter plots of all common MS signals found in the three replicates. Panel B: Scatter plots of common MS signals which have been successfully assigned to an MS/MS identification. N is the number of common signals. The Spearman value reflects the similarity of the signal intensities between the compared samples (ideal case Spearman correlation=1). Each red dotted line parallel to the diagonal represents a two-fold difference, i.e. across three lines the fold change is 2*2*2=8. Accordingly, the axes show the mass spectrometric intensity (arbitrary counts) in log2 units.

The scatter plots for each compared LC-MS(/MS) pair, whether based on the total MS signals (Panel A) or on the successful MS/MS measurements (Panel B), showed highly correlated intensities. The clustering of the common features along the diagonal indicates that the precursor MS ions have the same intensity in the two samples considered, as expected in this experiment, in which replicate injections of the same sample were analyzed. Furthermore, the Spearman correlation values for both types of analysis were almost identical, strengthening our conviction that a comparison of samples based on the precursor mass intensity should give results comparable to those obtained using the protein identification descriptor.

[Figure 4.26 panel labels: pairwise scatter plots of the injection replicates Pepmix_50fmol_1_a, Pepmix_50fmol_1_b and Pepmix_50fmol_1_c. Panel A: N = 12894 for each comparison. Panel B: N = 228 for each comparison. Spearman values shown in the plots: 0.998, 0.994, 0.983 and 0.977, 0.977, 0.949.]

Scatter plots as shown in Fig. 4.26 could also be used to visualize dilution effects. For example, as shown in Fig. 4.27, the precursor mass signal intensity of a 50 fmol standard peptide mixture is compared to the precursor mass signal intensity of a 125 fmol injection of the same peptide mixture.

Figure 4.27: Scatter plot representations of a comparison between a 125 fmol and a 50 fmol injection of a Standard Peptide Mixture. Panel A: Scatter plots of all the common MS signals between the samples. Panel B: Scatter plots of the MS signals that correspond to successful MS/MS measurements. N is the number of common signals. The Spearman value reflects the similarity of the signal intensities between the compared samples (ideal case Spearman correlation=1). Each red dotted line parallel to the diagonal represents a two-fold difference, i.e. across three lines the fold change is 2*2*2=8. Accordingly, the axes show the mass spectrometric intensity (arbitrary counts) in log2 units.

The 109 common precursor mass intensities based on MS/MS identification (Fig. 4.27, panel B) all clearly lay on a diagonal offset from unity by approximately 2.5-fold, as expected from the experimental design. A similar pattern was observed for the 5662 common precursor masses found in both samples (Fig. 4.27, panel A), except that two ion populations were clearly detected. The first population, comprising the most abundant ions of the analysis, lay on a diagonal with an offset of approximately 2.5-fold, like the ions identified by MS/MS. This first population was assumed to represent actual peptides that were differentially detected in the analysis. The second ion population, which included mostly low-abundance signals, clustered along the diagonal and could represent consistent chemical noise that was co-analyzed with the samples.
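The expected offset follows directly from the injected amounts: 125 fmol / 50 fmol = 2.5, i.e. log2(2.5) ≈ 1.32 on the log2 axes of the scatter plot. Since a rank correlation is unchanged by a constant fold offset, such a shift has to be estimated separately from the Spearman value. The sketch below shows one possible way to do so, using the median log2 ratio of paired signals; the function name and the intensity values are hypothetical and are not taken from the measurements reported here.

```python
from math import log2
from statistics import median

def median_log2_offset(intens_high, intens_low):
    """Median of log2(high / low) over paired common signals."""
    return median(log2(h / l) for h, l in zip(intens_high, intens_low))

# Offset expected from the experimental design (125 fmol vs. 50 fmol).
expected = log2(125 / 50)  # about 1.32 log2 units

# Hypothetical paired intensities (counts) of five common signals.
low = [1000.0, 5200.0, 800.0, 43000.0, 2500.0]        # 50 fmol injection
high = [2600.0, 12500.0, 2100.0, 104000.0, 6300.0]    # 125 fmol injection

offset = median_log2_offset(high, low)  # offset in log2 units
fold = 2 ** offset                      # the same offset as a fold change
```

The median is used here rather than the mean so that a few mismatched signals do not dominate the estimate.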

[Figure 4.27 panel labels: Pepmix_125fmol plotted against Pepmix_50fmol. Panel A: Spearman: 0.937, N = 5662. Panel B: Spearman: 0.986, N = 109.]

More subtle effects, such as column ageing, could also be monitored using the same strategy. The effect of column history on MS signal intensity was demonstrated by following the pattern of a 50 fmol standard peptide mixture analyzed on a new column and then again on the same column, with the same LC buffers, after several sample injections (Fig. 4.28).

Figure 4.28: Scatter plot representations of two 50 fmol injections of a Standard Peptide Mixture, injected onto the same column with an interval of several samples. Panel A: Scatter plots of all the common MS signals between the samples. Panel B: Scatter plots of the MS signals that correspond to successful MS/MS measurements. N is the number of common signals. The Spearman value reflects the similarity of the signal intensities between the compared samples (ideal case Spearman correlation=1). Each red dotted line parallel to the diagonal represents a two-fold difference, i.e. across three lines the fold change is 2*2*2=8. Accordingly, the axes show the mass spectrometric intensity (arbitrary counts) in log2 units.

Not surprisingly, the common signals of the two identical peptide standards did not show the same intensity. MS signals were stronger when the fresh standard peptide mixture (ET_pepmix_50fmol_220807) was analyzed on a new column (for more details see appendix B: table of process variation time record). The Spearman correlation values of 0.74 for the total MS signals and 0.755 for the MS signals with successful MS/MS identification were markedly lower than in the ideal case, and demonstrated that column history might contribute to the experimental variability of comparable samples.

The impact of changing the column and/or buffers between sample measurements was also investigated. The comparison of standard peptide mixtures measured on two identical columns (identical dimensions, lot number and sample history) showed that the common MS signal intensities deviated somewhat from the diagonal, which was also reflected in the lower Spearman correlation value (Fig. 4.29), but without dramatic changes in the data behavior.


Figure 4.29: Scatter plot representations of two 50 fmol injections of a Standard Peptide Mixture, injected onto two identical columns with the same sample history. Panel A: Scatter plots of all the common MS signals between the samples. Panel B: Scatter plots of the MS signals that correspond to successful MS/MS measurements. N is the number of common signals. The Spearman value reflects the similarity of the signal intensities between the compared samples (ideal case Spearman correlation=1). Each red dotted line parallel to the diagonal represents a two-fold difference, i.e. across three lines the fold change is 2*2*2=8. Accordingly, the axes show the mass spectrometric intensity (arbitrary counts) in log2 units.

A similar data distribution was observed when the standard peptide mixture was analyzed on the same column but using different LC solvent batches (results not shown). However, these two effects acted synergistically when the standard peptide mixture was analyzed using different columns and different LC buffer batches (Fig. 4.30), as reflected by the lower Spearman correlation values. Interestingly, the column/buffer effects were consistently more pronounced at the global precursor MS level than for the common signals that were linked to a successful MS/MS measurement. However, the limited number of measurements performed with the standard peptide mixture did not allow us to determine whether this difference in distribution was mostly due to a massive change in the distribution of background ions relative to the peptide signals, or to minor mismatching of the precursor masses during sample analysis, leading to increased noise in the corresponding scatter plots.

[Scatter plot panel labels (figure number not recoverable from the extracted text): axis labels Pepmix_50fmol and ET69_pepmix_50fmol. Panel A: Spearman: 0.832, N = 5662. Panel B: Spearman: 0.951, N = 109.]

Figure 4.30: Scatter plot representations of two injections of a 50 fmol Standard Peptide Mixture onto two different columns and using different buffer batches. Panel A: Scatter plots of all the common MS signals between the samples. Panel B: Scatter plots of the MS signals that correspond to successful MS/MS measurements. N is the number of common signals. The Spearman value reflects the similarity of the signal intensities between the compared samples (ideal case Spearman correlation=1). Each red dotted line parallel to the diagonal represents a two-fold difference, i.e. across three lines the fold change is 2*2*2=8. Accordingly, the axes show the mass spectrometric intensity (arbitrary counts) in log2 units.

In summary, the analysis of a simple standard peptide mixture under idealized differential conditions of the LC-MS system confirmed that the precursor ion intensity is an appropriate descriptor for evaluating the similarity of one sample to another and for pinpointing common experimental deviations, such as column and LC buffer changes, or dilution effects.

In particular, the behavior of all the ions considered in a pair of samples was very comparable to that of the ion population shown by tandem mass spectrometry to represent the mass spectrometric signals of peptides shared by the two samples.

The quality of this similarity was expressed using the Spearman correlation value, a non-parametric measure of how well the relationship between two paired ion intensity populations follows an arbitrary monotonic function, here a simple linear function of slope 1. It should be noted that the Spearman correlation value was calculated here taking into account all the ions considered. However, the graphic representation of the ion distribution in the form of scatter plots clearly separated two populations of ions. The first group mostly clustered at the diagonal independently of the samples being compared and tended to encompass the lower intensity ions. The second group, which included mostly the higher intensity ions, also clustered along a diagonal but with a distinct offset from the first group, depending on the type of samples being compared. We believe that these two groups represented the solvent and LC contaminant ions and the sample peptide ions, respectively, and that they should be considered separately in a future version of this similarity measure.
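How the two ion populations might be treated separately is left open in the text. Purely as an illustration, the sketch below splits paired signals at a fixed log2 intensity threshold and reports the median log2 offset of each group. The threshold, the function name and the data are hypothetical assumptions; a real separation would probably need a data-driven criterion rather than a fixed cut-off.

```python
from math import log2
from statistics import median

def split_offsets(pairs, threshold_log2):
    """Split (x, y) intensity pairs by their mean log2 intensity and return
    the median log2(y / x) of the low and of the high intensity group."""
    low, high = [], []
    for x, y in pairs:
        mean_log = (log2(x) + log2(y)) / 2
        group = high if mean_log >= threshold_log2 else low
        group.append(log2(y / x))
    return (median(low) if low else None,
            median(high) if high else None)

# Hypothetical pairs: weak signals close to the diagonal and strong
# signals shifted by a 2.5-fold offset.
pairs = [
    (100.0, 100.0), (120.0, 110.0), (90.0, 95.0),
    (10000.0, 25000.0), (20000.0, 50000.0), (40000.0, 100000.0),
]

# Assumed cut-off of 10 log2 units (1024 counts), chosen only for this example.
low_offset, high_offset = split_offsets(pairs, threshold_log2=10.0)
```

In this toy example the low intensity group shows no offset, while the high intensity group recovers the 2.5-fold shift, mirroring the two populations seen in the scatter plots.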