2 Results and Discussion

2.2 Structural characterization of galectins by proteolytic peptide mapping - mass spectrometry

As an essential first step towards the identification of carbohydrate binding sites, the primary structures of the galectins were characterized by Edman sequencing and mass spectrometric peptide mapping. Digestion yields were checked at several time intervals by one-dimensional gel electrophoresis to establish optimal conditions for complete proteolysis. All expected tryptic peptide fragments (Appendix, Tables A 1-4), except the very small peptides (1-3 amino acids), were identified by MALDI-MS and LC/MS.

For human galectin-1, N-terminal sequence determination confirmed the absence of the initiator methionine and showed that the first amino acid, alanine, was not acetylated (Appendix, Figure A 1). Since galectin-1 is susceptible to inactivation through structural changes caused by the formation of disulfide bridges, all experiments were performed either in the presence of dithiothreitol (DTT) (termed native galectin-1) or with the iodoacetamide-alkylated protein (termed alkylated galectin-1). Native galectin-1 was digested in solution with trypsin, and the resulting peptide mixture was analyzed by LC/MS/MS (Figure 16) and MALDI-TOF MS (Figure 17).

[Figure 16: LC/MS/MS mass spectra of the native human galectin-1 tryptic digest; x-axis: m/z 300-1000.] Several charge states of predicted tryptic peptides were identified (sequence coverage 100 %).
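The observation of several charge states can be made concrete with a short calculation. The sketch below assumes electrospray-type ionisation, in which a peptide of neutral mass M picks up z protons and appears at m/z = (M + z·m_p)/z. The neutral mass used is a made-up example value, not a peptide from this work.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_for_charge(neutral_mass: float, charge: int) -> float:
    """Return the m/z of the [M + zH]z+ ion for a given neutral mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical neutral monoisotopic mass, for illustration only.
example_mass = 2000.0
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz_for_charge(example_mass, z):.4f}")
```

Signals that map back to the same neutral mass at different z values belong to the same peptide.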

a) [MALDI-TOF mass spectrum; x-axis: m/z 500-2700; peak label 787.19]

b) 1 ACGLVASNLN LKPGECLRVR GEVAPDAKSF VLNLGKDSNN LCLHFNPRFN

51 AHGDANTIVC NSKDGGAWGT EQREAVFPFQ PGSVAEVCIT FDQANLTVKL

101 PDGYEFKFPN RLNLEAINYM AADGDFKIKC VAFD

Figure 17. (a) MALDI-TOF MS of native human galectin-1 tryptic digest. All peptides produced monoprotonated ion peaks, usually associated with signals corresponding to sodium adducts ([M+Na+]+). (b) Sequence of galectin-1, in which the identified peptides are highlighted in red.

LC/MS/MS provided full sequence determination and revealed no modifications of the expected protein structure. Using MALDI-TOF MS, all predicted tryptic fragments could be detected, except for the small fragments (2-5 amino acids).
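The set of predicted tryptic fragments can be listed with a simple in-silico digest. The following is a minimal sketch, not the procedure used in this work: it assumes the common rule that trypsin cleaves C-terminal to lysine or arginine unless the next residue is proline, and it allows no missed cleavages. The function name `tryptic_digest` is illustrative.

```python
# Human galectin-1 sequence as shown in Figure 17 (b), 134 residues.
GALECTIN_1 = (
    "ACGLVASNLNLKPGECLRVRGEVAPDAKSFVLNLGKDSNNLCLHFNPRFN"
    "AHGDANTIVCNSKDGGAWGTEQREAVFPFQPGSVAEVCITFDQANLTVKL"
    "PDGYEFKFPNRLNLEAINYMAADGDFKIKCVAFD"
)


def tryptic_digest(sequence):
    """Return (start, end, peptide) tuples with 1-based inclusive positions.

    Assumed rule: cleave after K or R, except when followed by P.
    """
    peptides = []
    start = 0
    # The last residue never triggers a cleavage, so stop one short.
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


peptides = tryptic_digest(GALECTIN_1)
small = [pep for _, _, pep in peptides if len(pep) <= 5]

for first, last, pep in peptides:
    print(f"[{first}-{last}] {pep}")
print("fragments of 5 residues or fewer:", small)
```

Under these assumptions the shortest predicted fragments are two to five residues long, which matches the size range of the fragments not detected by MALDI-TOF MS.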

Since the digest sample was not desalted before the MS analysis, the MALDI mass spectrum showed sodium adduct peaks ([M+Na+]+) for almost all observed peptides, in addition to the expected protonated peptide signals.
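The position of a sodium adduct relative to the protonated signal follows from the mass difference between Na+ and H+. As a small sketch (the constants are standard monoisotopic ion masses; the input value is hypothetical), a singly charged sodium adduct is expected about 21.98 m/z units above the protonated peptide:

```python
PROTON_MASS = 1.007276       # H+ in Da
SODIUM_ION_MASS = 22.989218  # Na+ in Da (monoisotopic)


def sodiated_mz(protonated_mz: float) -> float:
    """Convert a singly charged protonated m/z into the sodium-adduct m/z."""
    return protonated_mz - PROTON_MASS + SODIUM_ION_MASS


# Hypothetical protonated peptide signal, for illustration only.
print(f"{sodiated_mz(1000.0):.3f}")
```

Pairs of peaks separated by this shift can therefore be assigned to the same peptide.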

Two versions of human galectin-3 were analyzed: (i) the full-length galectin-3 (hGal-3) and (ii) a truncated version consisting only of the CRD (hGal-3C). hGal-3C was prepared by proteolytic removal of the N-terminal collagenase-sensitive stalk region, to prevent possible oligomerization through the N-terminal domain [155, 156].

The truncated version was shown to have carbohydrate-binding properties similar to those of the intact molecule [40]. The sequence numbering of the intact protein was retained to facilitate data comparison. For full-length galectin-3, Edman N-terminal sequence determination showed the first amino acid to be methionine, followed by the expected N-terminal sequence (Appendix, Figure A 2). Mass spectrometric analysis of a galectin-3 tryptic digest by MALDI-FTICR and MALDI-TOF MS identified only peptides originating from the CRD (Figure 18). The N-terminal domain does not contain any arginine or lysine residues; tryptic digestion therefore releases it as a single 129-amino-acid peptide. However, this peptide ([M+H+]+calc. = 12302.8) was not observed, probably due to its high hydrophobicity and tendency to aggregate. LC/MS/MS analysis using a C4 column also did not reveal this peptide.
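The size of this N-terminal peptide can be checked directly from the sequence. The sketch below is illustrative only: it assumes cleavage after K or R unless followed by P, uses standard monoisotopic residue masses, and takes residues 1-150 of hGal-3 as printed in Figure 18 (b). It locates the first tryptic site and computes the singly protonated mass of the peptide ending there.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

# Residues 1-150 of hGal-3.
HGAL3_1_150 = (
    "MADNFSLHDALSGSGNPNPQGWPGAWGNQPAGAGGYPGASYPGAYPGQAP"
    "PGAYPGQAPPGAYHGAPGAYPGAPAPGVYPGPPSGPGAYPSSGQPSAPGA"
    "YPATGPYGAPAGPLIVPYNLPLPGGVVPRMLITILGTVKPNANRIALDFQ"
)


def first_tryptic_site(sequence):
    """Return the 1-based position of the first K/R not followed by P."""
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            return i + 1
    return None


def protonated_mass(peptide):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


site = first_tryptic_site(HGAL3_1_150)
n_terminal_peptide = HGAL3_1_150[:site]
print(f"first cleavage site: residue {site}")
print(f"[1-{site}] monoisotopic singly protonated mass: "
      f"{protonated_mass(n_terminal_peptide):.1f}")
```

With monoisotopic masses, the result should land close to the calculated value of 12302.8 quoted above; average masses would give a value several Da higher.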

a) [MALDI-TOF mass spectrum; x-axis: m/z 900-1800; peak label 831.23]

b) 1 MADNFSLHDA LSGSGNPNPQ GWPGAWGNQP AGAGGYPGAS YPGAYPGQAP

51 PGAYPGQAPP GAYHGAPGAY PGAPAPGVYP GPPSGPGAYP SSGQPSAPGA

101 YPATGPYGAP AGPLIVPYNL PLPGGVVPRM LITILGTVKP NANRIALDFQ

151 RGNDVAFHFN PRFNENNRRV IVCNTKLDNN WGREERQSVF PFESGKPFKI

201 QVLVEPDHFK VAVNDAHLLQ YNHRVKKLNE ISKLGISGDI DLTSASYTMI

Figure 18. (a) MALDI-TOF MS of intact human galectin-3 tryptic digest. All assigned peptides were monoprotonated (singly charged) and originated from the CRD. The N-terminal peptide [1-129] (calculated m/z 12302.8) was not observed. (b) Sequence of hGal-3, in which the identified peptides are highlighted in red.

For hGal-3C, N-terminal sequence analysis showed the sequence starting with N118 (Appendix, Figure A 3). MALDI-TOF MS (Figure 19) and LC/Ion trap MS/MS (Figure 20) analysis of the tryptic digest identified all expected fragments, including the N- and C-terminal peptides, [119-129] and [234-250]. In the MALDI spectrum, the signals of small fragments (1-4 amino acids, m/z < 800) could not be clearly distinguished from matrix signals, but these fragments were identified by LC/MS.

a) [MALDI-TOF mass spectrum; peak label [152-162]]

b) 101 --- ---NL PLPGGVVPRM LITILGTVKP NANRIALDFQ

151 RGNDVAFHFN PRFNENNRRV IVCNTKLDNN WGREERQSVF PFESGKPFKI

201 QVLVEPDHFK VAVNDAHLLQ YNHRVKKLNE ISKLGISGDI DLTSASYTMI

Figure 19. (a) MALDI-TOF MS of truncated human galectin-3 tryptic digest. All ions produced were singly charged. (b) Sequence of hGal-3C, in which the identified peptides are highlighted in red.

a) [LC/Ion trap mass spectra]

b) 101 --- ---NL PLPGGVVPRM LITILGTVKP NANRIALDFQ

151 RGNDVAFHFN PRFNENNRRV IVCNTKLDNN WGREERQSVF PFESGKPFKI

201 QVLVEPDHFK VAVNDAHLLQ YNHRVKKLNE ISKLGISGDI DLTSASYTMI

Figure 20. (a) LC/Ion trap MS of truncated human galectin-3 tryptic digest. (b) Sequence of hGal-3C, in which the identified peptides are highlighted in red (sequence coverage 86 %).

For the truncated chicken galectin-3 (cGal-3C), Edman analysis provided the N-terminal sequence 95Pro-Tyr-Ser-Glu-Ala (Appendix, Figure A 4). MALDI-MS analysis of the tryptic digest (Figure 21) identified all peptides with the exception of the N- and C-terminal peptides, [95-105] and [225-241]. LC/Ion trap MS/MS of the tryptic digest (Figure 22) allowed the identification of the N-terminal fragment [95-105], whereas the C-terminal peptide could not be detected. However, this peptide was observed in a further experiment as part of the longer fragment [218-241], which contains two missed cleavage sites (Figure 54).
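The number of missed cleavage sites in a fragment can be counted from the sequence alone. The sketch below assumes the usual trypsin rule (cleavage after K or R unless followed by P) and uses only the C-terminal region of cGal-3, residues 201-241, as printed in the sequence of Figure 21 (b). The helper names are illustrative.

```python
# Residues 201-241 of chicken galectin-3.
C_TERM_REGION = "KVAVNDAHLLQFNFREKKLNEITKLCIAGDITLTSVLTSMI"
REGION_START = 201


def fragment(first: int, last: int) -> str:
    """Return residues first..last (1-based numbering of the intact protein)."""
    return C_TERM_REGION[first - REGION_START:last - REGION_START + 1]


def missed_cleavages(peptide: str) -> int:
    """Count internal K/R residues that trypsin would be expected to cleave."""
    return sum(
        1
        for i, residue in enumerate(peptide[:-1])
        if residue in "KR" and peptide[i + 1] != "P"
    )


long_fragment = fragment(218, 241)
print(long_fragment, missed_cleavages(long_fragment))
print(fragment(225, 241), missed_cleavages(fragment(225, 241)))
```

For [218-241] the two uncleaved sites are K218 and K224; the fully cleaved C-terminal peptide [225-241] contains none.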

a) [MALDI-TOF mass spectrum; x-axis: m/z 750-1750; peak label 717.51]

b) 1 MSDGFSLSDA LPAHNPGAPP PQGWNRPPGP GAFPAYPGYP GAYPGAPGPY

51 PGAPGPHHGP PGPYPGGPPG PYPGGPPGPY PGGPPGPYPG GPTAPYSEAP

101 AAPLKVPYDL PLPAGLMPRL LITITGTVNS NPNRFSLDFK RGQDIAFHFN

151 PRFKEDHKRV IVCNSMFQNN WGKEERTAPR FPFEPGTPFK LQVLCEGDHF

201 KVAVNDAHLL QFNFREKKLN EITKLCIAGD ITLTSVLTSM I

Figure 21. (a) MALDI-TOF MS of truncated chicken galectin-3 tryptic digest. (b) Sequence of cGal-3C, in which the identified peptides are marked in red. The N-terminal domain, missing in the truncated protein, is marked in grey. The sequence coverage was 74 %, due to unidentified N- and C-terminal peptides.
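Sequence coverage is the fraction of residues of the analyzed protein that fall within at least one identified peptide. The sketch below shows one way to compute it from residue ranges; the function name and the example ranges are illustrative and are not the peptide list behind the reported values. The worked case is a hypothetical one in which every residue of cGal-3C (residues 95-241) is covered except the two terminal peptides [95-105] and [225-241].

```python
def sequence_coverage(first, last, identified):
    """Percentage of residues first..last covered by (start, end) ranges.

    Ranges are 1-based and inclusive; overlapping ranges are counted once.
    """
    covered = set()
    for start, end in identified:
        covered.update(range(max(start, first), min(end, last) + 1))
    return 100.0 * len(covered) / (last - first + 1)


# Hypothetical case: only the terminal peptides [95-105] and [225-241] missing.
upper_bound = sequence_coverage(95, 241, [(106, 224)])
print(f"{upper_bound:.1f} %")
```

The value actually reported for a spectrum depends on exactly which peptides were assigned, which cannot be recovered from the text alone.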

a) [LC/Ion trap mass spectra; peak label 294.2]

b) 1 MSDGFSLSDA LPAHNPGAPP PQGWNRPPGP GAFPAYPGYP GAYPGAPGPY

51 PGAPGPHHGP PGPYPGGPPG PYPGGPPGPY PGGPPGPYPG GPTAPYSEAP

101 AAPLKVPYDL PLPAGLMPRL LITITGTVNS NPNRFSLDFK RGQDIAFHFN

151 PRFKEDHKRV IVCNSMFQNN WGKEERTAPR FPFEPGTPFK LQVLCEGDHF

201 KVAVNDAHLL QFNFREKKLN EITKLCIAGD ITLTSVLTSM I

Figure 22. (a) LC/Ion trap MS of truncated chicken galectin-3 tryptic digest. (b) Sequence of cGal-3C, in which the identified peptides are marked in red (sequence coverage 83 %). The N-terminal domain, missing in the truncated protein, is marked in grey.

Galectin-8 is classified as a tandem-repeat galectin, consisting of two CRDs (the N-terminal Gal-8N and the C-terminal Gal-8C) joined by a linker peptide. Six isoforms are expressed through alternative splicing of the LGALS8 gene: three are tandem-repeat forms with linkers of various lengths, and three are proto-type forms consisting of a single CRD [157]. Three forms of human galectin-8 were analyzed in this work: (i) the short-linker isoform of intact galectin-8, which consists of amino acids [1-317] (hGal-8S), (ii) the N-terminal domain comprising amino acids [1-153] (hGal-8N) and (iii) the C-terminal domain encompassing amino acids [175-317] (hGal-8C).

For hGal-8S and hGal-8N, Edman analysis showed the same N-terminal sequence: 1MMLSL. Two different N-terminal sequences were identified for the C-terminal domain (hGal-8C), M175VPKSGT and 176PKSGT. The sequence numbering of the intact protein (hGal-8S) was retained for hGal-8C to simplify comparison of the results. The three galectin-8 forms were also digested in solution with trypsin, and the resulting peptide mixtures were analyzed by LC/MS/MS and MALDI-TOF MS. All predicted tryptic fragments, except the shortest peptides (1-4 amino acids), were identified for hGal-8N (Figure 24) and hGal-8C (Figure 25). The assigned sequences of hGal-8N and hGal-8C matched the N- and C-terminal domains of hGal-8S. Several tryptic peptides from the linker region of hGal-8S could not be detected, but the corresponding peptides were later identified by employing clostripain (see Chapter 3.7.3).

a) [MALDI-TOF mass spectrum; x-axis: m/z 1000-3800; peak label 943.77]

b) 1 MMLSLNNLQN IIYNPVIPFV GTIPDQLDPG TLIVIRGHVP SDADRFQVDL

51 QNGSSMKPRA DVAFHFNPRF KRAGCIVCNT LINEKWGREE ITYDTPFKRE

101 KSFEIVIMVL KDKFQVAVNG KHTLLYGHRI GPEKIDTLGI YGKVNIHSIG

151 FSFSSDLQST QASSLELTEI SRENVPKSGT PQLRLPFAAR LNTPMGPGRT

201 VVVKGEVNAN AKSFNVDLLA GKSKDIALHL NPRLNIKAFV RNSFLQESWG

251 EEERNITSFP FSPGMYFEMI IYCDVREFKV AVNGVHSLEY KHRFKELSSI

301 DTLEINGDIH LLEVRSW

Figure 23. (a) MALDI-TOF MS of hGal-8S tryptic digest. All ions produced were singly charged. (b) Sequence of hGal-8S, in which the identified peptides are highlighted in red and cover 74 % of the sequence.

a) [MALDI-TOF mass spectrum; x-axis: m/z 900-3700; peak label 997.61]

b) 1 MMLSLNNLQN IIYNPVIPFV GTIPDQLDPG TLIVIRGHVP SDADRFQVDL

51 QNGSSMKPRA DVAFHFNPRF KRAGCIVCNT LINEKWGREE ITYDTPFKRE

101 KSFEIVIMVL KDKFQVAVNG KHTLLYGHRI GPEKIDTLGI YGKVNIHSIG

151 FSF

Figure 24. (a) MALDI-TOF mass spectrum of hGal-8N tryptic digest. All ions produced were singly charged. (b) Sequence of hGal-8N, in which the identified peptides are highlighted in red and cover 93 % of the sequence.

a) [MALDI-TOF mass spectrum; x-axis: m/z 900-2500; peak label 674.40]

b) 151 --- --- ---MVPKSGT PQLRLPFAAR LNTPMGPGRT

201 VVVKGEVNAN AKSFNVDLLA GKSKDIALHL NPRLNIKAFV RNSFLQESWG

251 EEERNITSFP FSPGMYFEMI IYCDVREFKV AVNGVHSLEY KHRFKELSSI

301 DTLEINGDIH LLEVRSW

Figure 25. (a) MALDI-TOF mass spectrum of hGal-8C tryptic digest. All ions produced were singly charged. (b) Sequence of hGal-8C, in which the identified peptides are highlighted in red and cover 87 % of the sequence.

2.3 Characterization of galectins-carbohydrate interactions by