
2. RESULTS AND DISCUSSION

2.4.2. Primary structure characterisation of H1CRD using mass spectrometry

The recombinant carbohydrate recognition domain of the subunit H1 employed in the present work was expressed in E. coli and purified by affinity chromatography using a galactose-Sepharose affinity column, followed by size exclusion chromatography [169]. The recombinant protein comprises the amino acid residues [148-290] of the H1CRD and an initiator methionine residue. The amino acid residues of the recombinant H1CRD are numbered from 1 to 145 and the same numbering format will be used throughout par. 2.4. (Figure 55).

H1CRD (P07306)         GSERTCCPVN157  WVEHERSCYW167  FSRSGKAWAD177  ADNYCRLEDA187
Recombinant protein  M1GSERTCCPVN11   WVEHERSCYW21   FSRSGKAWAD31   ADNYCRLEDA41

H1CRD (P07306)         HLVVVTSWEE197  QKFVQHHIGP207  VNTWMGLHDQ217  NGPWKWVDGT227
Recombinant protein    HLVVVTSWEE51   QKFVQHHIGP61   VNTWMGLHDQ71   NGPWKWVDGT81

H1CRD (P07306)         DYETGFKNWR237  PEQPDDWYGH247  GLGGGEDCAH257  FTDDGRWNDD267
Recombinant protein    DYETGFKNWR91   PEQPDDWYGH101  GLGGGEDCAH111  FTDDGRWNDD121

H1CRD (P07306)         VCQRPYRWVC277  ETELDKASQE287  PPLL291
Recombinant protein    VCQRPYRWVC131  ETELDKASQE141  PPLL145

Figure 55: Alignment of the recombinant sequence of H1CRD to the sequence available at the Swiss-Prot protein sequence database with the accession number P07306

The molecular mass of the intact H1CRD was initially determined by ESI-FT-ICR mass spectrometry. For FT-ICR-MS, an aliquot of the H1CRD stock solution was diluted with 50 % methanol, 1 % acetic acid in MilliQ water to a final concentration of 6 pmol/µl, and the sample was introduced into the electrospray source at a flow rate of 2 µl/min. The charge of each molecular ion present in the mass spectrum was determined from the 1/z spacing between the isotope peaks, as shown in the inset on the right side of Figure 56. The theoretical monoisotopic molecular mass of the molecule is 16975.5795 in the oxidized state (containing 3 disulfide bridges) and 16981.6263 in the reduced state. The monoisotopic molecular mass determined from the mass spectrum is 16921.9093, which corresponds to a mass shift of 52.5624 to 59.7170 Da, depending on the number of cysteine residues present in the reduced and oxidized state under the conditions of the electrospray FT-ICR mass measurement.
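The charge assignment from the isotope spacing can be illustrated with a short sketch. The peak positions below are hypothetical values chosen only to give a spacing of 1/11 m/z; they are not read from the measured spectrum, and the function names are illustrative.

```python
PROTON = 1.007276  # mass of a proton in Da


def charge_from_isotope_spacing(mz_peaks):
    """Estimate the charge state z from the mean spacing of adjacent isotope peaks."""
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    # Adjacent isotope peaks differ by about 1 Da, so they are about 1/z apart in m/z.
    return round(1.0 / mean_spacing)


def neutral_mass(mz, z):
    """Convert the m/z of an [M+zH]z+ ion into the neutral mass M."""
    return z * (mz - PROTON)


# Hypothetical isotope peaks of one charge state, spaced by 1/11 m/z
peaks = [1539.40 + i / 11 for i in range(5)]
z = charge_from_isotope_spacing(peaks)
print(z, round(neutral_mass(peaks[0], z), 2))
```

The same two steps, applied to every charge state in the spectrum, are what a deconvolution routine does before reporting a single neutral mass.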


The only sequence modification that would decrease the molecular mass by approximately the calculated mass shift is a missing glycine residue (57.0215 Da). However, expression of the protein lacking a single amino acid residue within the sequence is unlikely. A more common modification during protein expression is the loss of the initiator methionine.

Therefore it can be hypothesised that the initiator methionine is missing (a molecular weight decrease of 131.0405 Da) and that one amino acid carries a modification that increases the molecular mass by 72.3313-79.4781 Da. Because the sample was treated with β-mercaptoethanol during the extraction of the protein from the cell lysate and during the purification steps [169], and because the sequence contains an odd number of cysteine residues, the formation of an adduct of H1CRD with β-mercaptoethanol was assumed. In Table 8 the experimentally determined monoisotopic molecular mass is compared with the theoretical monoisotopic masses of the protein according to the number of disulfide bonds and the modifications discussed above.

Figure 56: ESI-FT-ICR mass spectrum of the H1CRD in 50 % methanol, 1 % acetic acid in water at a concentration of 6 pmol/µl. The spectrum shows 8- to 13-fold charged molecular ions ([M+8H]8+ to [M+13H]13+). The isotope-resolved mass spectrum of the 11-fold charged molecular ion is shown in the upper right panel. The inset on the left side shows the isotopic structure of the ion [M+H]+ after deconvolution of the ESI spectrum.

Table 8: Comparison between the experimentally determined molecular weight of the H1CRD and the theoretical mass of the protein according to the proposed modifications

                            Monoisotopic molecular weight [M+H]+
Number of disulfide bonds   H1CRD        -Gly         -Met         -Met/+β-ME
Reduced state               16981.6263   16924.6048   16850.5858   16926.5841
1 S-S bond                  16979.6107   16922.5859   16848.5702   16924.5685
2 S-S bonds                 16977.5951   16920.5736   16846.5546   16922.5529
3 S-S bonds                 16975.5795   16918.5580   16844.5390   16920.5373

Measured: 16921.9093
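The candidate masses discussed above follow from simple additive mass differences. The sketch below is a minimal illustration, assuming the reduced-state value 16981.6263 as the starting point; the β-mercaptoethanol increment of 75.9983 Da (C2H4OS, that is β-mercaptoethanol minus two hydrogen atoms) is an assumption consistent with the difference between the -Met and -Met/+β-ME values, and the function name is illustrative.

```python
HYDROGEN = 1.007825      # monoisotopic mass of a hydrogen atom (Da)
MET_RESIDUE = 131.0405   # initiator methionine residue, value quoted in the text
GLY_RESIDUE = 57.0215    # glycine residue, value quoted in the text
BME_ADDUCT = 75.9983     # assumed increment for a beta-mercaptoethanol adduct (C2H4OS)

REDUCED_H1CRD = 16981.6263  # theoretical value for the fully reduced protein


def candidate_mass(disulfides=0, lose_met=False, lose_gly=False, bme_adduct=False):
    """Theoretical mass for a given number of S-S bonds and set of modifications."""
    mass = REDUCED_H1CRD - disulfides * 2 * HYDROGEN  # each S-S bond removes 2 H
    if lose_met:
        mass -= MET_RESIDUE
    if lose_gly:
        mass -= GLY_RESIDUE
    if bme_adduct:
        mass += BME_ADDUCT
    return round(mass, 4)


for n in range(4):
    print(n, candidate_mass(n), candidate_mass(n, lose_met=True),
          candidate_mass(n, lose_met=True, bme_adduct=True))
```

Treating each modification as an independent offset keeps the enumeration transparent and makes it easy to add further hypotheses.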

The probability that an ion contains 13C and 15N isotopes increases with the molecular mass, so the relative intensity of the monoisotopic ion decreases. The determination of the monoisotopic mass is therefore difficult at higher molecular masses. To obtain information on the modification responsible for the molecular weight decrease observed in the ESI-FT-ICR mass spectrum, the primary structure was investigated by proteolytic digestion of native H1CRD with trypsin, followed by MALDI-FT-ICR analysis of the tryptic fragments. The arginine and lysine residues present in the amino acid sequence of H1CRD are highlighted in Figure 57a.

Complete digestion of H1CRD with trypsin can produce twelve tryptic fragments, which are designated T1 to T12 (Figure 57b); the cysteine residues are also indicated.
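The expected fragments can be reproduced with a small in silico digest. This is a minimal sketch that assumes the common simplified trypsin rule (cleavage C-terminal to Lys or Arg, except when the next residue is Pro) and the recombinant sequence numbered 1 to 145; the function name is illustrative.

```python
import re

# Recombinant H1CRD, residues 1-145 (initiator Met included)
H1CRD = (
    "MGSERTCCPVNWVEHERSCYWFSRSGKAWADADNYCRLED"
    "AHLVVVTSWEEQKFVQHHIGPVNTWMGLHDQNGPWKWVDG"
    "TDYETGFKNWRPEQPDDWYGHGLGGGEDCAHFTDDGRWND"
    "DVCQRPYRWVCETELDKASQEPPLL"
)


def tryptic_fragments(sequence):
    """Return (start, end, peptide) tuples, 1-based, for a complete tryptic digest."""
    # Zero-width split after K or R, unless the next residue is P
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    fragments, start = [], 1
    for piece in pieces:
        end = start + len(piece) - 1
        fragments.append((start, end, piece))
        start = end + 1
    return fragments


for i, (start, end, peptide) in enumerate(tryptic_fragments(H1CRD), 1):
    print(f"T{i}: {start}-{end} {peptide}")
```

Under this rule the Arg-Pro bonds at positions 91/92 and 125/126 stay intact, which is why T9 and T10 each contain an internal arginine.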

a)

MGSERTCCPV NWVEHERSCY WFSRSGKAWA DADNYCRLED  40
AHLVVVTSWE EQKFVQHHIG PVNTWMGLHD QNGPWKWVDG  80
TDYETGFKNW RPEQPDDWYG HGLGGGEDCA HFTDDGRWND 120
DVCQRPYRWV CETELDKASQ EPPLL

b)

Sequence positions: 1 5 17 24 28 37 53 76 88 117 128 137 145
Cysteine residues:  C7 C8 C19 C36 C109 C123 C131
Tryptic fragments:  T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12

Figure 57: (a) Amino acid sequence of the carbohydrate recognition domain of the asialoglycoprotein receptor subunit H1CRD; (b) Schematic representation of the trypsin cleavage sites, the tryptic peptide fragments and the location of the cysteine residues.

The protein was digested for 5 h at 37 °C using an enzyme-to-substrate ratio of 1:50.

The mass spectra of the tryptic digest are shown in Figure 58 and the fragments identified are summarized in Table 9.

Table 9: Molecular mass and amino acid sequence of the peptides obtained in the MALDI-FT-ICR mass spectrum by tryptic digestion of the native H1CRD

[M+H]+exp   [M+H]+calc.   Δm in ppm   Tryptic fragment            Sequence
1417.619    1417.6270     6           T8                          77WVDGTDYETGFK88
1451.679    1451.6485     21          T10                         118WNDDVCQRPYR128
1470.615    1470.6253     7.0         T2 (disulfide bridge)       6TCCPVNWVEHER17
1883.001    1882.9545     24.6        T5+T6                       28AWADADNYCRLEDAHLVVVTSWEEQK53
2303.980    2303.9694     4.6         T5/T11 (disulfide bridge)   28AWADADNYCR37 / 129WVCETELDK137
2698.298    2698.3096     4.3         T7                          54FVQHHIGPVNTWMGLHDQNGPWK76
3287.319    3287.3620     13.0        T9                          89NWRPEQPDDWYGHGLGGGEDCAHFTDDGR117
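The mass deviations in ppm listed in Table 9 can be recomputed from the measured and calculated values. The following is a minimal sketch using four of the table rows; the dictionary and function names are illustrative.

```python
def ppm_error(measured, calculated):
    """Absolute mass deviation in parts per million."""
    return abs(measured - calculated) / calculated * 1e6


# (measured [M+H]+, calculated [M+H]+) taken from Table 9
table9 = {
    "T8": (1417.619, 1417.6270),
    "T2 (disulfide bridge)": (1470.615, 1470.6253),
    "T5/T11 (disulfide bridge)": (2303.980, 2303.9694),
    "T7": (2698.298, 2698.3096),
}

for fragment, (measured, calculated) in table9.items():
    print(f"{fragment}: {ppm_error(measured, calculated):.1f} ppm")
```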

Figure 58: MALDI-FT-ICR mass spectra of the peptide mixture obtained by tryptic digestion of the native H1CRD. The fragments containing a disulfide bridge are indicated in red. The spectra were recorded by adjusting the hexapole parameters to obtain optimum sensitivity for the mass range of interest: (a) time-of-flight delay 2500 µs, RF amplitude 2.5 and (b) time-of-flight delay 3500 µs, RF amplitude 3.5.

The MALDI-FT-ICR mass spectra of the peptide mixture obtained by tryptic digestion of native H1CRD contained the tryptic fragments T3, T5, T7, T8, T9, T10 and T11. The fragments T5 and T11 are linked by a disulfide bridge, which is thus established between Cys36 and Cys131. The monoisotopic molecular mass ([M+H]+ = 2303.980) of the two tryptic fragments T5 and T11 bound through the disulfide bridge was determined with an accuracy of 4.6 ppm. However, the molecular ions of the free peptide fragments T5 and T11 were also identified in the mass spectrum, indicating that the disulfide bridge is not stable under the conditions of sample preparation and measurement. Furthermore, the measured mass of T2 ([M+H]+ = 1470.619) was 2.02 Da lower than the theoretical mass ([M+H]+ = 1472.6409). The fragment T2 contains two vicinal cysteine residues, Cys7 and Cys8; a disulfide bridge between them removes two hydrogen atoms and accounts for the 2.02 Da difference.
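The theoretical masses used in this comparison can be reproduced from the peptide sequences. The sketch below uses standard monoisotopic residue masses and treats each disulfide bond as the loss of two hydrogen atoms; the helper names are illustrative, not taken from any particular software package.

```python
# Standard monoisotopic residue masses in Da
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
HYDROGEN = 1.007825


def neutral_mass(peptide):
    """Monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER


def mh_plus(peptides, disulfides=0):
    """[M+H]+ of one peptide, or of several peptides joined by disulfide bonds."""
    total = sum(neutral_mass(p) for p in peptides)
    return total - disulfides * 2 * HYDROGEN + PROTON


t2_reduced = mh_plus(["TCCPVNWVEHER"])
t2_bridged = mh_plus(["TCCPVNWVEHER"], disulfides=1)
t5_t11 = mh_plus(["AWADADNYCR", "WVCETELDK"], disulfides=1)
print(round(t2_reduced, 4), round(t2_bridged, 4), round(t5_t11, 4))
```

An intramolecular bridge (T2) and an intermolecular bridge (T5/T11) are handled by the same function because both simply remove two hydrogen atoms per bond.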

To confirm the identity of the fragment T2 and the presence of the vicinal disulfide bridge, the cysteine residues were reduced with DTT and alkylated with iodoacetamide as described in the Experimental Section. The alkylated H1CRD was subjected to proteolytic digestion with trypsin and the resulting peptide fragments were analysed by MALDI-FT-ICR (Figure 59). The peptides were identified with an average mass error of 4.5 ppm. The fragments T3, T11 and T10, each containing one alkylated cysteine residue, and the fragment T2, containing two alkylated cysteine residues, were found at the expected molecular masses (Table 10), thus confirming the vicinal disulfide bridge in the native H1CRD.
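The expected masses of the alkylated fragments follow from the mass of the reduced peptide plus one carbamidomethyl group per cysteine. This is a minimal sketch, assuming the standard monoisotopic increment of 57.02146 Da (C2H3NO) for alkylation with iodoacetamide; the function name is illustrative.

```python
CARBAMIDOMETHYL = 57.02146  # monoisotopic mass added per alkylated cysteine (Da)


def alkylated_mh_plus(reduced_mh_plus, n_cysteines):
    """[M+H]+ after carbamidomethylation of all cysteines in a reduced peptide."""
    return reduced_mh_plus + n_cysteines * CARBAMIDOMETHYL


# T2 (two cysteines) and T10 (one cysteine), starting from the reduced [M+H]+ values
print(round(alkylated_mh_plus(1472.6409, 2), 4))  # compare with 1586.6839 calculated for T2
print(round(alkylated_mh_plus(1451.6485, 1), 4))  # compare with 1508.6699 calculated for T10
```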

Figure 59: MALDI-FT-ICR mass spectrum of the peptide mixture obtained by tryptic digestion of the alkylated H1CRD. The asterisk indicates a carbamidomethyl modification at cysteine. Labelled peaks: T8, T2**, T3*, T11, T10*, T11*+T12.

Table 10: Molecular weights and amino acid sequence of the peptides obtained in the MALDI-FT-ICR mass spectrum by tryptic digestion of the reduced and alkylated H1CRD

[M+H]+exp   [M+H]+calc.   Δm in ppm   Tryptic fragment   Sequence
1005.415    1005.4245     9.4         T3                 18SCYWFSR24
1179.529    1179.5351     5.1         T11                129WVCETELDK137
1417.618    1417.6270     6.3         T8                 77WVDGTDYETGFK88
1508.671    1508.6699     0.7         T10                118WNDDVCQRPYR128
1586.681    1586.6839     1.8         T2                 6TCCPVNWVEHER17
1882.962    1882.9545     3.9         T6                 38LEDAHLVVVTSWEEQK53
2014.988    2014.9790     4.4         T11+T12            129WVCETELDKASQEPPLL145

The identification of the disulfide bond Cys36-Cys131 is in agreement with the X-ray crystal data [166]. A second linkage, described to occur between Cys109 and Cys123, was not detectable under the conditions employed: the tryptic fragments T9, containing the residue Cys109, and T10, containing the residue Cys123, were identified separately in the mass spectrum of the tryptic digest of the native protein.

The third disulfide bridge, identified between Cys7 and Cys8, is not consistent with previous data based on the X-ray crystal structure, which describe a disulfide bridge between Cys8 and Cys19. A possible explanation for the vicinal disulfide bridge identified in the mass spectra of the tryptic digest is a rearrangement of the disulfide bonds after reduction, followed by renaturation of the H1CRD during the purification steps.

The formation of a disulfide bond between adjacent cysteine residues is a relatively rare structural element, which is usually accompanied by the formation of a β-type turn of the protein backbone. In the last few years, vicinal disulfides have been identified and structurally characterized in a variety of proteins including enzymes, receptors and toxins [170-178]. Recently, non-native vicinal disulfide bonds were observed during the oxidative folding of the 32-residue Amaranthus α-amylase inhibitor [174] and during the synthesis of α-conotoxin [179]. Using data on the Xxx-Cys-Cys-Yyy amino acid sequences of proteins that contain a vicinal disulfide bridge, taken from the Brookhaven Protein Data Bank, Hudaky and coworkers [180] showed that Ser, Thr, Leu, Gly, Glu and Pro, Asp, Asn, Arg are the most frequent amino acid residues in positions Xxx and Yyy, respectively. Notably, in the case of the H1CRD, Thr and Pro are present in the positions Xxx and Yyy (Thr-Cys-Cys-Pro).

The structural data obtained from the tryptic digestion of both the native form and the denatured form, the latter produced by reduction and alkylation of the cysteine residues, are summarized in Figure 60.

Figure 60: Schematic representation of the structural information obtained by tryptic digest of the H1-CRD and mass spectrometric determination of the resulting peptide fragments. The tryptic fragments that were identified in the mass spectra are highlighted in grey. The 2 disulfide bonds that were identified are indicated by a connector line between the cysteine residues involved in the formation of the bridge.

The peptides identified in the tryptic digests of the native and alkylated H1CRD cover approximately 95 % of the total sequence. The ions corresponding to the short peptide fragments T1 and T4 were not found in the mass spectra, suggesting that the molecular mass difference observed in the ESI-FT-ICR mass spectrum of the intact native H1CRD might be due to modifications within these regions. The trypsin cleavage at amino acid 5 confirms that the fragment T1 ends at Arg or Lys. This fragment should contain 5 amino acid residues and could not be observed in the mass spectra of the tryptic mixture. To obtain longer peptide sequences, the reduced and alkylated H1CRD was digested with LysC. The LysC cleavage sites are indicated in Figure 61 and the digestion fragments are designated L1 to L6.
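The coverage figure can be checked from the residue ranges of the tryptic fragments. In the sketch below the fragment boundaries are an assumption derived from the Lys and Arg positions of the recombinant sequence (with no cleavage before Pro), and the set of identified fragments is taken as every fragment except T1 and T4.

```python
# Residue ranges (1-based, inclusive) of the twelve tryptic fragments
FRAGMENTS = {
    "T1": (1, 5), "T2": (6, 17), "T3": (18, 24), "T4": (25, 27),
    "T5": (28, 37), "T6": (38, 53), "T7": (54, 76), "T8": (77, 88),
    "T9": (89, 117), "T10": (118, 128), "T11": (129, 137), "T12": (138, 145),
}
SEQUENCE_LENGTH = 145
NOT_OBSERVED = {"T1", "T4"}


def sequence_coverage(fragments, missing, length):
    """Percentage of residues that lie in at least one identified fragment."""
    covered = set()
    for name, (start, end) in fragments.items():
        if name not in missing:
            covered.update(range(start, end + 1))
    return 100.0 * len(covered) / length


print(f"{sequence_coverage(FRAGMENTS, NOT_OBSERVED, SEQUENCE_LENGTH):.1f} %")
```

Collecting residues in a set means overlapping fragments, such as missed-cleavage products, would not be counted twice.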

Figure 62a shows the base peak chromatogram obtained from the separation of 5 µg of the H1CRD fragment mixture produced by digestion with LysC. Separation was performed on an Agilent 1100 HPLC system with a mobile phase consisting of 0.1 % formic acid and 0.1 % formic acid in ACN. Gradient separation was carried out on a 150 mm × 4.6 mm, 3 µm Discovery RP-18 column at a flow rate of 50 µl/min.

Figure 61: Amino acid sequence of the carbohydrate recognition domain of the asialoglycoprotein receptor subunit H1 (A); Schematic representation of the LysC cleavage sites and the tryptic peptide fragments (B).

Tryptic map: sequence positions 1 5 17 24 28 37 53 76 88 117 128 137 145; cysteine residues C7 C8 C19 C36 C109 C123 C131; fragments T1 to T12.
LysC map: sequence positions 1 37 53 76 88 137 145; cysteine residues C7 C8 C19 C36 C109 C123 C131; fragments L1 to L6.

Table 11: Summary of LC-MS data for LysC-digested H1CRD. The mass spectra are shown in Figures 62 and 63.

LC Rt (min)   Measured m/z   Calculated m/z   Δm (Da)   Charge   Fragment   Amino acid sequence
33.5          709.298        709.317          0.02      2+       L4         WVDGTDYETGFK
34.4          854.424        854.461          0.04      1+       L6         ASQEPPLL
              427.719        427.734          0.02      2+       L6         ASQEPPLL
34.7          679.724        679.331          0.14      4+       L3 (a)     FVQHHIGPVNTWMGLHDQNGPWK
              543.933        543.666          0.27      5+       L3 (a)     FVQHHIGPVNTWMGLHDQNGPWK
35.6          819.512        819.364          0.15      4+       L1 (b)     GSERTCCPVNWVEHERSCYWFSRSGK
              655.946        655.693          0.25      5+       L1 (b)     GSERTCCPVNWVEHERSCYWFSRSGK
              546.654        546.578          0.07      6+       L1 (b)     GSERTCCPVNWVEHERSCYWFSRSGK
36.3          675.559        675.332          0.23      4+       L3         FVQHHIGPVNTWMGLHDQNGPWK
              540.718        540.467          0.25      5+       L3         FVQHHIGPVNTWMGLHDQNGPWK
38.6          1036.413       1035.817         0.6       3+       L2         AWADADNYCRLEDAHLVVVTSWEEQK
              777.423        777.114          0.3       4+       L2         AWADADNYCRLEDAHLVVVTSWEEQK
              622.139        621.893          0.25      5+       L2         AWADADNYCRLEDAHLVVVTSWEEQK

(a) Methionine oxidized
(b) N-terminal methionine missing
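The calculated m/z values of the multiply charged ions follow from the singly charged value by adding one proton per additional charge and dividing by the charge. The sketch below illustrates this for the fragments L4 and L6; the function name is illustrative.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz_for_charge(mh_plus, z):
    """m/z of the [M+zH]z+ ion, given the m/z of the singly charged [M+H]+ ion."""
    neutral = mh_plus - PROTON
    return (neutral + z * PROTON) / z


# L4 (WVDGTDYETGFK): calculated [M+H]+ = 1417.6270 (same peptide as T8)
print(round(mz_for_charge(1417.6270, 2), 3))  # compare with 709.317 in Table 11
# L6 (ASQEPPLL): calculated [M+H]+ = 854.461
print(round(mz_for_charge(854.461, 2), 3))    # compare with 427.734 in Table 11
```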

Figure 62: (a) Base peak chromatogram obtained from LC-MS separation of the peptide mixture produced by proteolytic digestion of H1CRD using LysC; (b) ESI-IT mass spectrum of the peptide with a retention time of 35.6 min. The multiply charged ions correspond to the L1 peptide fragment lacking methionine; (c) MS/MS spectrum of the [M+5H]5+ ion of the peptide eluted after 35.6 min. Fragmentation of the amide bonds provides the y8, y10 and y11 fragment ions, which demonstrate the correct identification of the L1 fragment missing the Met-1 residue.

Labelled ions in (b): [M+6H]6+ 546.654, [M+5H]5+ 655.946, [M+4H]4+ 819.512 (m/z axis 200 to 1200).
Sequence annotated in (c): GSERTCCPVNWVEHERSCYWFSRSGK, with the y8, y10 and y11 fragment ions marked (m/z axis 200 to 1100).

Figure 63: LC-ESI-IT mass spectra of the peptide fragments eluted at the retention times 38.6 min (a), 36.3 min (b), 33.5 min (c) and 34.4 min (e); LC-ESI-IT MS/MS spectrum of the precursor ion [M+2H]2+ = 709.298 (d); LC-ESI-IT MS/MS spectrum of the precursor ion [M+2H]2+ = 427.719 (f).

The mass spectra provided unambiguous identification of the fragments L2, L3, L4 and L6 (Table 11). The peptide with a retention time of 35.6 min was identified from its 4- to 6-fold charged ions, as shown in Figure 62b. The measured mass of this peptide matched the calculated mass of the fragment L1 lacking the N-terminal methionine, and the MS/MS spectrum of the 5-fold charged ion confirmed this identification. Removal of the initiator methionine by methionyl aminopeptidase has been observed to occur when the side chain of the following amino acid is small, as is the case for Gly, Ala, Thr, Pro, Ser and Val [181]. The cleavage probability was found to be highest for Gly (97 %), which is the residue following the initiator methionine in H1CRD.
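The rule for initiator methionine removal can be written as a simple check on the second residue. This is only a rough sketch of the rule as stated above; it ignores the different cleavage probabilities of the individual residues, and the function name is illustrative.

```python
# Residues with small side chains that favour removal of the initiator Met [181]
SMALL_SIDE_CHAIN = set("GATPSV")


def initiator_met_likely_removed(sequence):
    """True if the sequence starts with Met followed by a small residue."""
    return len(sequence) > 1 and sequence[0] == "M" and sequence[1] in SMALL_SIDE_CHAIN


print(initiator_met_likely_removed("MGSERTCCPVN"))  # N-terminus of recombinant H1CRD
```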

The lack of the N-terminal methionine decreases the expected monoisotopic mass of the intact H1CRD in the reduced state from 16981.6263 to 16850.5858. The mass identified in the electrospray mass spectrum of the intact H1CRD is 16921.9093.

Based on these data, the intact H1CRD lacking the N-terminal methionine, as measured in the electrospray spectrum, contains a modification that accounts for a difference of approximately 72-79 Da. This modification, however, was not observed in the detailed analysis of the proteolytic digestion mixtures produced by trypsin and LysC.

Although the structure of the peaks indicates that the sample is homogeneous, a precise determination of the monoisotopic mass is not possible, as explained above. Moreover, considering that the preferential sites of protonation are the N-terminus of the protein and the amino acids Lys, Arg and His, there are 21 protonation sites within the sequence, whereas the highest charge state observed in the electrospray mass spectrum is 13. We can therefore assume that the protein is partially denatured, and the number of disulfide bridges cannot be unambiguously identified. Because the protein was treated with β-mercaptoethanol during the purification steps, a plausible explanation for this result is the formation of an adduct with β-mercaptoethanol, which would introduce a modification of 76 Da at the seventh cysteine residue and which has been described previously [182].
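The number of protonation sites quoted above can be recounted from the sequence. The sketch below assumes the protein without the initiator methionine and counts one site for the N-terminus plus one for each Lys, Arg and His residue; the function name is illustrative.

```python
# H1CRD without the initiator methionine (residues 2-145 of the recombinant protein)
H1CRD_MATURE = (
    "GSERTCCPVNWVEHERSCYWFSRSGKAWADADNYCRLED"
    "AHLVVVTSWEEQKFVQHHIGPVNTWMGLHDQNGPWKWVDG"
    "TDYETGFKNWRPEQPDDWYGHGLGGGEDCAHFTDDGRWND"
    "DVCQRPYRWVCETELDKASQEPPLL"
)


def protonation_sites(sequence):
    """Count the N-terminus plus all Lys, Arg and His side chains."""
    basic = sum(sequence.count(residue) for residue in "KRH")
    return 1 + basic


sites = protonation_sites(H1CRD_MATURE)
print(sites, "possible sites; highest observed charge state: 13")
```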