
Identification and quantification of carbohydrate interacting structures in proteins using affinity-mass spectrometry

Dissertation submitted for the degree of Doctor of Natural Sciences

Presented by Adrian Moise

at the

Faculty of Mathematics and Natural Sciences

Department of Chemistry

Date of the oral examination: December 16th 2014

First referee: Prof. Dr. Dr. h. c. Michael Przybylski

Second referee: Prof. Dr. Wolfram Welte

Third referee: Prof. Dr. Jörg Hartig

Konstanzer Online-Publikations-System (KOPS) URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-286315


The work reported in this dissertation was completed between October 2007 and November 2012 in the Laboratory of Analytical Chemistry and Biopolymer Structure Analysis, Department of Chemistry of the University of Konstanz, under the supervision of Prof. Dr. Dr. h. c. Michael Przybylski.

I would like to thank:

Prof. Dr. Michael Przybylski for offering me the opportunity to perform my PhD work in his laboratory, for his guidance and kind advice throughout the years;

Prof. Dr. Hans-Joachim Gabius for kindly providing the galectin samples used in this work and for his helpful assistance;

Dr. Sabine André for scientific discussions and helpful suggestions;

Prof. Dr. Wolfram Welte for writing the second evaluation of my dissertation;

Prof. Dr. Jörg Hartig for writing the third evaluation of my dissertation;

Frederike Eggers for the dedicated work during her Master's thesis on galectin-5;

Madalina Maftei, Marilena Manea, Raluca Stefanescu, Alina Petre, Mihaela Stumbaum, Reinhold Weber and Andreas Marquardt for scientific discussions and interesting advice during my work;

All members of the group for the nice and inspiring atmosphere.


Publications

1. Otto V. I., Damoc E., Cueni L. N., Schurpf T., Frei R., Ali S., Callewaert N., Moise A., Leary J. A., Folkers G., Przybylski M. (2006) N-glycan structures and N-glycosylation sites of mouse soluble intercellular adhesion molecule-1 revealed by MALDI-TOF and FTICR mass spectrometry. Glycobiology. 16, 1033-1044.

2. Perdivara I., Deterding L., Moise A., Tomer K. B., Przybylski M. (2008) Determination of primary structure and microheterogeneity of a beta-amyloid plaque-specific antibody using high-performance LC-tandem mass spectrometry. Anal. Bioanal. Chem. 391, 325-336.

3. Simeonova D. D., Susnea I., Moise A., Schink B., Przybylski M. (2009) "Unknown genome" proteomics: a new NADP-dependent epimerase/dehydratase revealed by N-terminal sequencing, inverted PCR, and high resolution mass spectrometry. Mol. Cell. Proteomics. 8, 122-131.

4. Moise A., Andre S., Eggers F., Krzeminski M., Przybylski M., Gabius H. J. (2011) Toward Bioinspired Galectin Mimetics: Identification of Ligand-Contacting Peptides by Proteolytic-Excision Mass Spectrometry. J. Am. Chem. Soc. 133, 14844-14847.

5. Stefanescu R., Born R., Moise A., Ernst B., Przybylski M. (2011) Epitope structure of the carbohydrate recognition domain of asialoglycoprotein receptor to a monoclonal antibody revealed by high-resolution proteolytic excision mass spectrometry. J. Am. Soc. Mass Spectrom. 22, 148-157.

6. Jimenez-Castells C., Defaus S., Moise A., Przybylski M., Andreu D., Gutierrez-Gallego R. (2012) Surface-based and mass spectrometric approaches to deciphering sugar-protein interactions in a galactose-specific agglutinin. Anal. Chem. 84, 6515-6520.

7. Kellermeier M., Rosenberg R., Moise A., Anders U., Przybylski M., Colfen H. (2012) Amino acids form prenucleation clusters: ESI-MS as a fast detection method in comparison to analytical ultracentrifugation. Faraday Discuss. 159, 23-45.

8. Petre B. A., Ulrich M., Stumbaum M., Bernevic B., Moise A., Doring G., Przybylski M. (2012) When is Mass Spectrometry Combined with Affinity Approaches Essential? A Case Study of Tyrosine Nitration in Proteins. J. Am. Soc. Mass Spectrom. 23, 1831-1840.

9. Slamnoiu S., Vlad C., Stumbaum M., Moise A., Lindner K., Engel N., Vilanova M., Diaz M., Karreman C., Leist M., Ciossek T., Hengerer B., Vilaseca M., Przybylski M. (2014) Identification and Affinity-Quantification of β-Amyloid and α-Synuclein Polypeptides Using On-Line SAW-Biosensor-Mass Spectrometry. J. Am. Soc. Mass Spectrom. 25, 1472-1481.


Publications in Conference Proceedings

1. Przybylski M., Stefanescu R., Bacher M., Manea M., Moise A., Perdivara I., Marquardt A., Dodel R. C. (2006) Molecular approaches for immuno-therapy and diagnosis of Alzheimer's disease based on epitope-specific anti-beta-amyloid antibodies. Journal of Peptide Science. 12, 99-99.

2. Moise A., Susnea I., Simeonova D., Schink B., Przybylski M. (2008) "Unknown-genome" proteomics-based identification of a new NADP-epimerase/dehydratase from Desulf. phosphitoxidans by inverted-PCR, Edman-sequencing and high resolution mass spectrometry. Journal of Peptide Science. 14, 190-191.

3. Przybylski M., Moise A., Siebert H. C., Gabius H. J. (2008) CREDEX-MS: Molecular elucidation of carbohydrate recognition peptides in lectins and related proteins by proteolytic excision-mass spectrometry. Journal of Peptide Science. 14, 40-40.

4. Jimenez-Castells C., Moise A., de la Torre B. G., Przybylski M., Gutierrez-Gallego R., Andreu D. (2009) Analysis of carbohydrate-binding proteins using SPR and Mass Spectrometry. Drugs of the Future. 34, 119-119.

5. Moise A., Siebert H. C., Gabius H. J., Przybylski M. (2009) CREDEX-MS: proteolytic excision/extraction and affinity-mass spectrometric determination of carbohydrate recognition structures in proteins. New Biotechnology. 25, S19-S19.

Conference Oral and Poster Presentations

1. Moise A., Otto V. I., Damoc E., Cueni L. N., Schurpf T., Frei R., Ali S., Callewaert N., Leary J. A., Folkers G., Przybylski M. (2005) Identification of N-glycosylation sites and characterisation of N-glycan structure of soluble intercellular cell adhesion protein (Lec1 sICAM-1) by MALDI-FTICR-MS. Poster presentation at the 38th DGMS Conference (Rostock, Germany).

2. Moise A., Siebert H.-C., Gabius H.-J., Przybylski M. (2006) Identification of carbohydrate recognition domains in galectin-1 and -3 using proteolytic excision and affinity-mass spectrometry methods. Oral presentation at the 39th DGMS Conference (Mainz, Germany).

3. Moise A., Siebert H.-C., Gabius H.-J., Przybylski M. (2006) Identification of carbohydrate recognition domains in galectins using proteolytic excision and affinity-mass spectrometry methods. Poster presentation at the 17th IMSC (Prague, Czech Republic).

4. Moise A., Paraschiv G., Amstalden E., Marquardt A., Stefanescu R., Manea M., Perdivara I., Juszczyk P., Damoc E., Przybylski M., Vincke C., Muyldermans S. (2007) PAREXPROT: Identification of Antibody-Paratopes by Proteolytic Excision and High-Resolution Mass Spectrometry. Oral presentation at the 40th DGMS Conference (Bremen, Germany).

5. Moise A., Siebert H.-C., Gabius H.-J., Przybylski M. (2007) CREDEX: Mass spectrometric determination of carbohydrate recognition structures in proteins. Poster presentation at the 40th DGMS Conference (Bremen, Germany).

6. Moise A., Susnea I., Simeonova D., Schink B., Przybylski M. (2008) Non-genome proteomics based identification of a new protein from Desulfotignum phosphitoxidans bacterium. Poster presentation at the 30th EPS (Helsinki, Finland).

7. Moise A., Siebert H.-C., Gabius H.-J., Przybylski M. (2009) CREDEX-MS: proteolytic excision/extraction and affinity-mass spectrometric determination of carbohydrate recognition structures in proteins. Poster presentation at the 42nd DGMS Conference (Konstanz, Germany).

8. Moise A., Siebert H.-C., Gabius H.-J., Przybylski M. (2009) CREDEX-MS: proteolytic excision/extraction and affinity-mass spectrometric determination of carbohydrate recognition structures in proteins. Poster presentation at the 14th European Congress on Biotechnology (Barcelona, Spain).

9. Moise A., Gabius H.-J., Przybylski M. (2010) Affinity-MS approaches for identifying protein-carbohydrate interaction epitopes. Oral presentation at the 1st RSMS Conference (Sinaia, Romania).

10. Moise A., Stefanescu R., Born R., Ernst B., Przybylski M. (2010) Epitope structure of H1-CRD to a monoclonal antibody revealed by proteolytic excision mass spectrometry. Poster presentation at the 1st RSMS Conference (Sinaia, Romania).

11. Moise A., Stefanescu R., Born R., Ernst B., Przybylski M. (2010) Epitope structure of H1-CRD to a monoclonal antibody revealed by proteolytic excision mass spectrometry. Poster presentation at the 9th EFTMS Conference (Lausanne, Switzerland).


Table of contents

1 Introduction

1.1 Function and structure of carbohydrate-interacting proteins

1.1.1 Structure and properties of galectins

1.2 Analytical approaches for protein-carbohydrate interaction studies

1.3 Mass spectrometric approaches for characterizing biopolymer interactions

1.3.1 Soft-ionization mass spectrometry of biopolymers

1.3.2 Mass spectrometric approaches to the analysis of recognition structures in proteins

1.3.3 Mass spectrometric methods for elucidating protein-carbohydrate interactions

1.4 Scientific goals of the dissertation

2 Results and Discussion

2.1 Proteolytic-excision and -extraction mass spectrometry for identification of carbohydrate recognition sites in proteins

2.1.1 Development of elution systems for dissociating carbohydrate protein/peptide complexes

2.2 Structural characterization of galectins by proteolytic peptide mapping - mass spectrometry

2.3 Characterization of galectins-carbohydrate interactions by affinity-gel electrophoresis

2.4 Identification of the carbohydrate binding site in galectins

2.4.1 Identification of the carbohydrate binding site in human galectin-1

2.4.1.1 Influence of galectin-1 alkylation on lactose binding

2.4.2 Identification of the carbohydrate binding site in human galectin-3

2.4.2.1 Identification of blood group oligosaccharides binding site in human galectin-3

2.4.3 Identification of the carbohydrate binding site in chicken galectin-3

2.4.3.1 Identification of blood group oligosaccharides binding site in chicken galectin-3

2.4.4 Identification of the carbohydrate binding site in human galectin-4

2.4.5 Identification of the carbohydrate binding site in rat galectin-5

2.4.6 Identification of the carbohydrate binding site in human galectin-8

2.4.7 Comparison of X-ray crystallography and mass spectrometry data of galectin-carbohydrate complexes

2.5 Interaction studies of synthetic CRD peptides with carbohydrates

2.5.1 Characterization of interactions of synthetic galectin peptides with carbohydrates by affinity-mass spectrometry

2.5.2 Determination of dissociation constants for synthetic galectin peptide complexes with lactose

2.6 Identification of the galactose binding site in human alpha-galactosidase A

2.6.1 Influence of chaperones on galactose binding in alpha-galactosidase A

3 Experimental Part

3.1 Materials and reagents

3.2 Proteolytic and affinity-mass spectrometric methods

3.2.1 Preparation of immobilized carbohydrate columns

3.2.2 Proteolytic excision experiments

3.2.3 Proteolytic extraction experiments

3.2.4 Affinity-MS characterization of synthetic galectin peptides

3.3 Mass spectrometric methods

3.3.1 Time of flight mass spectrometry

3.3.2 Fourier-transform ion-cyclotron resonance mass spectrometry

3.3.3 Ion trap mass spectrometry

3.3.3.1 ESI-ion trap mass spectrometry

3.3.3.2 LC-ESI-ion trap mass spectrometry

3.4 N-terminal sequence determination by Edman degradation

3.5 Solid phase peptide synthesis

3.6 Chromatographic and electrophoretic separation methods

3.6.1 Reversed phase high performance liquid chromatography

3.6.2 Sample concentration and desalting

3.6.3 One-dimensional gel electrophoresis

3.6.4 Colloidal Coomassie staining

3.7 Chemical modification and enzymatic fragmentation of peptides and proteins

3.7.1 Reduction and alkylation of disulfide bonds

3.7.2 Proteolytic digestion of proteins using trypsin

3.7.3 Proteolytic digestion of proteins using clostripain

3.7.4 Proteolytic digestion of proteins using chymotrypsin

3.8 SAW biosensor

4 Summary

5 Zusammenfassung

6 References

7 Appendix

7.1 Appendix 1: N-terminal sequence determination of galectins

7.2 Appendix 2: Summary of proteolytic peptide mapping results for galectins

7.3 Appendix 3: Proteolytic-excision and -extraction of galectin-3

7.4 Appendix 4: Affinity-MS of synthetic galectin peptides

7.5 Appendix 5: KD determinations for lactose complexes of galectins and synthetic galectin peptides


1 Introduction

1.1 Function and structure of carbohydrate-interacting proteins

Glycoproteins and glycosphingolipids are two major classes of glycoconjugates present in mammalian cells. They are formed through covalent attachment of polysaccharides (glycans) to the functional groups of proteins and lipids in an enzymatic process called glycosylation. Glycosylation is one of the most abundant post-translational modifications (PTM), occurring in more than half of the eukaryotic proteins [1]. Glycans are mainly attached to proteins via amide linkages to asparagine side chains (N-glycosylation) or through glycosidic bonds to the side chains of serine and threonine residues (O-glycosylation). Glycosidic linkages are also involved in the attachment of glycans to lipids. The major type of glycolipids in mammalian cells is represented by glycosphingolipids, in which the carbohydrate is bound to sphingosine [2].

Biosynthesis of glycans takes place in the endoplasmic reticulum (ER) and the Golgi apparatus by the action of glycosyltransferases and glycosylhydrolases [3, 4]. O-linked glycans are synthesized step-by-step, while N-glycosylation of proteins begins with the addition of a tetradecasaccharide (Glc3Man9GlcNAc2), which is subsequently trimmed by hydrolases (glucosidases and mannosidases) and finally elongated again by specific glycosyltransferases [3, 5]. The assembly of glycan chains is non-template-driven [6] and the final glycan structures depend on the carbohydrate specificity of the enzymes which are expressed in each organism. An illustrative example of the effect of expression of different glycosyltransferases is the polymorphism of complex carbohydrate structures of glycoproteins and glycolipids expressed at the surface of erythrocytes. This process leads to the different ABO blood group phenotypes, which are of high importance for blood transfusions and organ transplantations.

N- and O-linked glycans are an evolutionarily conserved feature of all organisms and serve as markers for cellular recognition [7]. Carbohydrate-protein interactions are fundamental for a wide range of physiological and pathophysiological processes including fertilization, immune response, pathogen infection, cell-cell adhesion, cell growth, development, apoptosis and metastasis [8-15]. The importance of these interactions is highlighted by the considerable number of diseases which result from deficient biosynthesis or processing of glycans. For example, inherited defects of lysosomal glycosylhydrolases responsible for the degradation of a wide variety of glycoproteins and glycolipids lead to pathological levels of substrate accumulation in tissues of patients suffering from lysosomal storage diseases (LSDs) [16, 17].

Glycans are very well suited for their role in molecular recognition, being able to form a vast number of complex, branched structures that surpass all other classes of biopolymers in their coding capacity [18]. In contrast to proteins and nucleic acids, which store information through the number and sequence of amino acids and nucleotides, in carbohydrates information is also encoded by the position and anomeric configuration of the glycosidic linkages. Glycan structures can be further differentiated through various modifications such as sulfation, phosphorylation, acetylation and methylation [3]. The high relevance of the information encoded by the glycan part of cellular glycoconjugates attracts increasing research interest in the structural aspects of the interplay between bioactive glycan determinants and their endogenous receptors, i.e. lectins [19].

Lectins (from Latin legere – to select, to choose) [20] are a structurally diverse group of carbohydrate-binding proteins found in a wide variety of organisms, ranging from viruses and plants to humans [21]. Lectins are able to reversibly bind carbohydrates with high specificity and thereby mediate biological recognition events. They are neither enzymes (they do not modify the bound carbohydrates) nor antibodies [21]. A comparison of equilibrium dissociation constants (KD) for antigen-antibody and lectin-carbohydrate systems shows that carbohydrate-protein interactions are considerably weaker than protein-protein interactions, by a factor of approximately 10³. KD values of lectins binding to simple monosaccharides are usually in the millimolar range (KD ≈ 10⁻³ M), while binding strengths to complex carbohydrates are higher (KD ≈ 10⁻⁶ M) [21, 22]. Lectins may have several carbohydrate binding sites per molecule [23] and can form non-covalent intermolecular clusters (oligomeric structures, see Figure 1), which leads to concurrent multiple interactions and thus to high-avidity binding (KD ≈ 10⁻⁹ to 10⁻¹¹ M) [24]. Due to their multivalency, lectins can interact with oligosaccharides on the surface of cells, such as red blood cells, resulting in agglutination [25]. The specificity of lectins, combined with their ability to agglutinate erythrocytes, is extensively used in blood differentiation [26, 27].
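To give a feel for what the KD ranges quoted above mean in practice, the following minimal Python sketch computes the fraction of occupied binding sites for a simple 1:1 equilibrium, θ = [L] / (KD + [L]). It assumes that the free ligand concentration is known and that the high-avidity regime can be treated as a single apparent KD; the function name and the 1 µM ligand concentration are illustrative choices, not values from this work.

```python
def fraction_bound(free_ligand_molar: float, kd_molar: float) -> float:
    """Fraction of occupied binding sites for simple 1:1 binding.

    Assumes equilibrium and a known free ligand concentration (both in mol/L).
    """
    if free_ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return free_ligand_molar / (kd_molar + free_ligand_molar)


# Orders of magnitude taken from the text; the avidity case is treated
# here as an apparent KD, which is a simplification.
KD_REGIMES = {
    "monosaccharide (KD ~ 1e-3 M)": 1e-3,
    "complex carbohydrate (KD ~ 1e-6 M)": 1e-6,
    "multivalent, high avidity (KD ~ 1e-9 M)": 1e-9,
}

FREE_LIGAND_MOLAR = 1e-6  # hypothetical free ligand concentration (1 µM)

for label, kd in KD_REGIMES.items():
    occupancy = fraction_bound(FREE_LIGAND_MOLAR, kd)
    print(f"{label}: fraction bound = {occupancy:.4f}")
```

At a fixed ligand concentration, the occupancy rises from nearly zero in the millimolar regime to nearly complete in the nanomolar regime, which is the practical consequence of multivalent binding described above.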

Figure 1. Lectins can increase their valency by oligomerization. (a) The trivalent monomer of the snowdrop lectin (Galanthus nivalis agglutinin, GNA) forms (b) a tetramer with twelve carbohydrate binding sites (Protein Data Bank structure ID: 1MSA). Coloring is done by protein chain. Carbohydrates (α-D-mannose) are represented as stick models.

Many lectins are multi-domain proteins (e.g. animal lectins of C-, I- and P-types); however, all lectins contain at least one carbohydrate recognition domain (CRD) responsible for the sugar-binding activity. Lectins may have multiple CRDs with different carbohydrate specificities. A classification of lectins may be based on their monosaccharide specificity: mannose, fucose, galactose/N-acetylgalactosamine, N-acetylglucosamine, and N-acetylneuraminic acid. Lectins are also classified according to their overall tertiary and quaternary structure. The major secondary structure component in most lectins is β-sheet, which can participate in the formation of multiple tertiary structures such as β-sandwich, β-trefoil, β-propeller, β-prism I and II [23, 28, 29]. Some examples of tertiary and quaternary structure of lectins are shown in Figure 2.

Figure 2. Structure of different lectins exhibiting various lectin folds (first row) and quaternary structures (second row). Coloring is done according to secondary structure (red for beta sheet, blue for alpha helix, green for random coil). (a) β-prism fold in snowdrop lectin (GNA, PDB entry 1MSA); (b) five-bladed β-propeller fold of Tachylectin-2 (a lectin from the Japanese horseshoe crab, Tachypleus tridentatus, PDB entry 1TL2); (c) mixed α/β-fold of human interleukin-8 (PDB entry 1IL8); (d) homodimer of human galectin-1 (PDB entry 1GZW); (e) homotetramer of peanut agglutinin (PNA, PDB entry 1C1W); (f) pentameric structure of human serum amyloid P component (SAP, PDB entry 1LGN).

1.1.1 Structure and properties of galectins

Galectins are a family of galactose-binding lectins found in animals (vertebrates and invertebrates) and some fungi. The first galectin was discovered in 1975 by V. Teichberg [30] in extracts from the electric organ of the eel and in other animal tissues. It was assigned as a low molecular weight (14-16 kDa) hemagglutinin, the activity of which can be inhibited by β-galactosides. Later, several similar lectins (initially called "soluble β-galactoside-binding lectins") were identified in tissues from chicken, bovine, human, rat and other species. Galectins are involved in crucial cellular processes such as cell-cell adhesion, cell migration, cell development and differentiation, chemotaxis and apoptosis [31, 32]. Together with another class of animal lectins, the C-type lectins, they exert various functions within innate immunity mechanisms [33]. Galectins are also involved in tumor metastasis, with studies showing that galectin gene expression may be up-regulated by growth factors and oncogenes and down-regulated by tumor suppressors [34]. Galectins are produced in the cytoplasm but are also present in the nucleus and extracellularly [35], in various types of tissues (e.g. muscle, kidney, epithelial, neuronal). Their pattern of expression changes with the developmental and physiological stages of the organism [31].

Galectins are soluble, metal ion independent in their activity and lack glycosylation and signal sequences. They usually do not form disulfide bridges and in most cases their N-terminal amino acids are acetylated [31, 35]. Galectins have at least one CRD with affinity for β-galactosides. The galectin CRDs share certain conserved sequence elements and the β-sandwich fold [31, 36]. Most galectins are multivalent due to oligomerization or to the presence of two CRDs joined by a linker peptide.

Fifteen members of the galectin family have been discovered so far. Galectins 5, 6, 11, 14 and 15 have not been identified in humans. Galectins 5 and 6 are present in rodents, while galectins 11, 14 and 15 were found in sheep.

Based on the protein architecture, galectins may be classified into three groups: proto, chimera and tandem-repeat [37] (Figure 3). Proto-type galectins (galectins 1, 2, 5, 7, 10, 11, 13, 14 and 15) can form non-covalent homodimers and thus crosslink identical carbohydrates. There is a single known chimera type galectin, galectin-3 [38]. In addition to the carbohydrate recognition domain, galectin-3 contains an N-terminal collagen-like domain [35] which may serve for binding non-carbohydrate ligands or forming oligomers [39]. Even if the N-terminal domain of galectin-3 is cleaved off, the remaining part, containing the CRD, retains its carbohydrate binding ability [40]. Tandem-repeat galectins (galectins 4, 6, 8, 9 and 12) have two different CRDs in a single polypeptide chain, which allows them to cross-link different carbohydrates.

[Figure 3 panels: Proto (two identical CRDs): galectins 1, 2, 5, 7, 10, 11, 13, 14, 15. Chimera (Gly-, Pro-, Tyr-rich domain and CRD): galectin 3. Tandem-repeat (two distinct CRDs): galectins 4, 6, 8, 9, 12.]

Figure 3. Classification of galectins according to their structural features. Galectins 1, 2, 5, 7, 10, 11, 13, 14 and 15 are of the proto type, forming non-covalent homodimers which can crosslink identical carbohydrates. Galectin-3 is the only known chimera type galectin [38], containing a collagen-like domain which may be used to bind non-carbohydrate ligands or aid in oligomerization. Galectins 4, 6, 8, 9 and 12 belong to the tandem-repeat type, having two different CRDs which allow them to cross-link different carbohydrates [31, 41]. Adapted from Barondes, S.H., et al., Galectins. Structure and function of a large family of animal lectins. J. Biol. Chem., 1994. 269(33): p. 20807-10.

Eight residues that interact directly with the carbohydrate ligands are conserved in most mammalian galectins [139, 150, 151] (Figure 4). The importance of these residues for carbohydrate binding was confirmed by studies employing site-directed mutagenesis and X-ray crystallography [42-47]. Studies of galectin-carbohydrate complexes showed a network of hydrogen bonds and van der Waals interactions. Stacking interactions of carbohydrates with aromatic amino acid side chains are especially important for recognition and orientation of the carbohydrate.

The importance of the stacking interactions for binding selectivity may be exemplified by the case of the strongly conserved tryptophan residue in the WGxExR/K motif (W68 in Gal1, W181 in Gal3) which interacts with carbons C3, C4 and C5 on the β face of the galactose ring and allows distinguishing galactose from glucose [48].

Substitutions of W68 in galectin-1 with other amino acids lead to a decrease in binding. Thus, the galectin-1 mutant W68Y showed only 50 % of the activity of wild-type galectin-1 binding to asialofetuin [43]. Galectin-1-W68F displayed a 20 % reduction in binding to lactose (as lactosyl-Sepharose), while the W68L mutant was shown to be completely inactive [42]. Although galectins bind mostly to the galactose subunit of carbohydrates through hydrogen bonds from the HxNxR motif and the stacking interactions mentioned above, they also establish interactions with other carbohydrate subunits, which leads to an increase in affinity. As an example, the affinity of rat galectin-1 is 130-fold higher for lactose (a disaccharide composed of glucose and galactose) than for galactose alone [49]. In addition to amino acid residues directly contacting the carbohydrate, a significant role is played by intra-chain electrostatic interactions which enable the orientation of the side chains for binding to carbohydrate hydroxyls [44]. Hydrogen bonds mediated by water molecules also participate in carbohydrate binding [43].
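As a rough illustration of what a 130-fold affinity difference means energetically, the ratio can be converted into a difference in standard binding free energy. The sketch below assumes that the 130-fold figure corresponds to the ratio of the two dissociation constants and that T = 298 K; neither assumption is stated in the source.

```latex
\begin{align*}
\Delta G^\circ &= RT \ln K_D \\
\Delta\Delta G^\circ
  &= \Delta G^\circ_{\text{lactose}} - \Delta G^\circ_{\text{galactose}}
   = RT \ln\frac{K_{D,\text{lactose}}}{K_{D,\text{galactose}}}
   = -RT \ln 130 \\
  &\approx -\left(8.314\ \mathrm{J\,mol^{-1}\,K^{-1}}\right)\left(298\ \mathrm{K}\right)\left(4.87\right)
   \approx -12.1\ \mathrm{kJ\,mol^{-1}}
   \approx -2.9\ \mathrm{kcal\,mol^{-1}}
\end{align*}
```

Under these assumptions, the additional contacts with the glucose unit contribute on the order of 12 kJ/mol, comparable to a small number of hydrogen bonds.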

Human Gal1    NLCLHFNPRFNAHGD---ANTIVCNSKD-GGAWG--TEQRE
Human Gal2    KLNLHFNPRFS---ESTIVCNSLD-GSNWG--QEQRE
Human Gal3    DVAFHFNPRFNENN---RRVIVCNTKL-DNNWG--REERQ
Human Gal4N   DVAFHFNPRFDG---WDKVVFNTLQ-GGKWG--SEERK
Human Gal4C   DIALHINPRMG---NGTVVRNSLL-NGSWG--SEEKK
Rat Gal5      DIAFHLNPRFD---ENAVVRNTQI-NNSWG--PEERS
Mouse Gal6N   DVAFHFNPRFDG---WDKVVFNTKQ-SGRWG--KEEEK
Mouse Gal6C   DIALHINPRI---GDCLVRNSYM-NGSWG--TEERM
Human Gal7    DAALHFNPRLD---TSEVVFNSKE-QGSWG--REERG
Human Gal8N   DVAFHFNPRFKR---AGCIVCNTLI-NEKWG--REEIT
Human Gal8C   DIALHLNPRLN---IKAFVRNSFL-QESWG--EEERN
Human Gal9N   DIAFHFNPRFED---GGYVVCNTRQ-NGSWG--PEERK
Human Gal9C   HIAFHLNPRFD---ENAVVRNTQI-DNSWG--SEERS
Human Gal10   DIVFHFQVCF---GRRVVMNSRE-YGAWK-QVESKN
Human Gal12N  DIAFHFNPRFHT---TKPHVICNTLH-GGRW--QREARW
Human Gal12C  DQAAHAPVTLR---ASFADRTLAW-ISRWG-QKKL
Human Gal13   DIAFRFRVHFG---NHVVMNRRE-FGIWM--LEETT
Human Gal14   DIAFQFRLHFG---HPAIMNSRV-FGIWR--YEEKC

Figure 4. Sequence alignment of carbohydrate binding sites in galectins showing conserved residues. For galectins containing two CRDs, the N- and C-terminal domains are denoted "N" and "C". Conserved amino acid residues which interact directly with the carbohydrate are highlighted in blue.
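The two sequence motifs named in the text (HxNxR and WGxExR/K) can be located in the Figure 4 segments with a simple pattern search. The following Python sketch is only an illustration: the regular expressions are a literal reading of the motif names, the four segments are copied from the alignment above, and the gap characters are removed before searching. The function and variable names are hypothetical.

```python
import re

# Segments copied from the Figure 4 alignment; '-' marks alignment gaps.
SEGMENTS = {
    "Human Gal1": "NLCLHFNPRFNAHGD---ANTIVCNSKD-GGAWG--TEQRE",
    "Human Gal3": "DVAFHFNPRFNENN---RRVIVCNTKL-DNNWG--REERQ",
    "Human Gal10": "DIVFHFQVCF---GRRVVMNSRE-YGAWK-QVESKN",
    "Human Gal12C": "DQAAHAPVTLR---ASFADRTLAW-ISRWG-QKKL",
}

# 'x' in the motif names is read as "any residue" ('.' in regex syntax).
HXNXR = re.compile(r"H.N.R")
WGXEXRK = re.compile(r"WG.E.[RK]")


def motif_report(aligned_segment: str) -> dict:
    """Report which of the two motifs occur in a gapped alignment segment."""
    ungapped = aligned_segment.replace("-", "")
    return {
        "HxNxR": bool(HXNXR.search(ungapped)),
        "WGxExR/K": bool(WGXEXRK.search(ungapped)),
    }


for name, segment in SEGMENTS.items():
    print(name, motif_report(segment))
```

With these literal patterns, the galectin-1 and galectin-3 segments contain both motifs, while the galectin-10 and galectin-12C segments contain neither, in line with the statement that the contact residues are conserved in most, but not all, galectins.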


1.2 Analytical approaches for protein-carbohydrate interaction studies

An essential step preceding the characterization of protein-ligand intermolecular interactions is the primary structure analysis of proteins, which can be performed by mass spectrometry, proteolytic peptide mapping and Edman sequencing [50]. Additional characterization of the secondary and tertiary structure of proteins, especially the determination of disulfide bonds and the identification of post-translational modifications (PTMs), provides important information for the study of protein-ligand complexes. Secondary and tertiary structure analysis of proteins may be performed through various techniques, such as IR spectroscopy [51], circular dichroism spectroscopy (CD) [52, 53], hydrogen-deuterium exchange-mass spectrometry (HDX-MS), tertiary structure-specific chemical modification followed by mass spectrometry [54, 55], crosslinking followed by mass spectrometry, X-ray crystallography and nuclear magnetic resonance (NMR).

After careful characterization of the interacting partners, structural analysis of protein-ligand complexes may be carried out through different methods, each with its advantages and disadvantages in terms of purity and amount of sample required, sensitivity, specificity and speed of analysis [56-58]. These approaches may be divided into: (i) structural methods, e.g. X-ray crystallography [45-47, 59], NMR [13-15]; (ii) biophysical methods, e.g. isothermal titration calorimetry (ITC) [16-18], surface plasmon resonance (SPR) [60-62], quartz crystal microbalance (QCM), surface acoustic waves (SAW) [22, 23]; (iii) biochemical methods, e.g. enzyme-linked immunosorbent assay (ELISA), enzyme-linked lectin assay (ELLA) [63], inhibition of hemagglutination (HIA) [64].

The binding of carbohydrates to proteins such as lectins is one of the most important types of biological interaction, with key functions in cellular recognition processes, intracellular regulation pathways, and immunological reactions [25-29].

The three-dimensional structures of some carbohydrate-binding proteins, free and in complex with carbohydrates, have been solved by X-ray crystallography or nuclear magnetic resonance spectroscopy (NMR). X-ray crystallography is a powerful technique for the structure analysis of proteins and protein-ligand complexes, capable of providing atomic coordinates of an entire molecular assembly in the solid crystalline state [65]. Crystal complexes with glycans have been defined for a number of plant and animal lectins, bacterial toxins, and enzymes that bind carbohydrates [31, 32]. In order to accurately characterize protein-glycan interactions, well-resolved crystal structures (2-2.5 Å) must be obtained. A sufficient resolution is often difficult or even impossible to attain and, in general, obtaining good quality crystals suitable for structure and interaction analysis of protein-glycan complexes is a tedious and time-consuming process. Crystallization of the sample is a critical step which requires large amounts of high purity protein and the optimization of a wide range of conditions (pH, temperature, salts, protein concentration). Large carbohydrates are often difficult to crystallize and too flexible to yield sufficient electron density. A crystal structure may be obtained for the protein in the absence of the ligand and a potential binding site may be predicted using modeling software and comparison with available structures of homologous proteins [33, 34].

Only a small number of proteins and protein-glycan complexes form crystals that diffract well and yield a crystallographic structure [66]. Furthermore, the conformation adopted by both ligand and protein in a crystal might differ from the conformation preferred in solution. To overcome these problems, structural information on proteins in solution, under physiological conditions, may be obtained by small-angle X-ray scattering (SAXS) and small-angle neutron scattering (SANS) [67, 68]. SAXS and SANS are powerful albeit low-resolution techniques which may be employed to monitor the influence of various experimental conditions on the tertiary and quaternary structures of proteins [67, 69-71]. This aspect is especially important for carbohydrate-binding proteins such as lectins, which form oligomers.

Both SAXS and SANS may provide complementary information to X-ray crystallography, require overall smaller amounts of sample and may be applied for the analysis of dilute solutions, without restrictive pH and ionic strength constraints.

Another method for investigating tertiary and quaternary structures of proteins and protein complexes in solution is NMR spectroscopy. In NMR, structural information may be obtained by use of heteronuclear single quantum coherence (HSQC) and the transferred nuclear Overhauser effect (TRNOE). HSQC involves the transfer of nuclear spin polarization (magnetization) from a proton to a directly bonded second nucleus (15N or 13C). The magnetization is then transferred back to the proton for detection. The transfer occurs through J-coupling (through-bond dipole-dipole coupling). The 2D 1H-15N HSQC spectrum of a protein correlates the 1H chemical shift of the proton attached to the backbone amide nitrogen of each amino acid with the 15N chemical shift of the nitrogen. Ligand binding perturbs the peak positions, and by comparing the spectra of the free protein and of the protein-ligand complex the ligand-contacting amino acids may be identified [40, 41]. NOE is a distance-dependent (~r⁻⁶ [72]), through-space transfer of nuclear spin polarization between protons in close contact (r < 5 Å [73]). Intramolecular NOEs, which arise in the bound conformation of a ligand, are transferred to the free ligand in solution through chemical exchange. TRNOE measurements allow the determination of proton-proton distances within a bound ligand, thus providing information on the bound-state conformation of the ligand [74-76]. TRNOE measurements are applicable to ligands that are weakly bound (KD > μM) and exchange with the free ligand faster than the cross-relaxation rate. For NOE measurements, as for X-ray crystallography, the flexibility of the carbohydrates causes problems: if multiple conformations of a molecule co-exist in equilibrium in solution, the measured NOE intensities will yield time-averaged values of the existing conformations. The large excess of ligand required to maximize the number of exchange events may lead to nonequilibrium conditions and non-specific binding [77].
Intermolecular TRNOE cross-peaks, arising from magnetization transfer between protein protons and bound ligand protons, may be used for mapping intermolecular contact sites [43, 48]. In addition to the structural information, NMR may also provide equilibrium constants by monitoring the chemical shift variation induced by increasing amounts of ligand (1D 1H NMR titration) [49, 50].
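The way a titration yields an equilibrium constant may be illustrated with a minimal sketch. It assumes a simple 1:1 binding equilibrium in fast exchange, so that the observed chemical shift change is the population-weighted average of the free and bound states; the function names, concentrations and the maximal shift change are hypothetical and are not taken from the present work.

```python
import math


def bound_fraction(p_total, l_total, kd):
    """Fraction of protein in the complex for a 1:1 equilibrium.

    All three arguments must use the same concentration unit.
    Solves [PL]^2 - ([P]t + [L]t + KD)[PL] + [P]t[L]t = 0 for [PL].
    """
    s = p_total + l_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total


def observed_shift_change(p_total, l_total, kd, delta_max):
    """Fast-exchange assumption: shift change scales with the bound fraction."""
    return delta_max * bound_fraction(p_total, l_total, kd)


if __name__ == "__main__":
    protein_um = 100.0  # hypothetical total protein concentration (uM)
    kd_um = 200.0       # hypothetical dissociation constant (uM)
    for ligand_um in (0.0, 100.0, 400.0, 1600.0):
        d = observed_shift_change(protein_um, ligand_um, kd_um, delta_max=0.25)
        print(f"ligand {ligand_um:7.1f} uM -> shift change {d:.3f} ppm")
```

In practice KD and the maximal shift change would be obtained by fitting this isotherm to the measured titration points; the sketch only shows the forward model.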

Kinetic and thermodynamic data may also be obtained using ITC, QCM, SPR, and SAW. ITC evaluates the change in free energy resulting from binding of a glycan to a lectin [78]. Its main advantage is the ability to monitor interactions without the need for immobilization or chemical modification of the binding partners. Drawbacks of ITC include the requirement of high sample amounts (> 10 mg), solubility problems when low-affinity interactions are studied, as well as the limited accuracy with which temperature changes may be determined. In ITC, small amounts of glycan are added to the protein solution at regular time intervals and the heat exchanged due to complex formation is recorded and plotted as power (μcal·s⁻¹) versus time (s). The data are integrated with respect to time, yielding the titration curve, which represents the change in enthalpy (ΔH, kcal·mol⁻¹) as a function of molar ratio [64]. This curve is then fitted to a theoretical binding model to yield the binding constant (K), binding enthalpy and interaction stoichiometry. The change in free energy and entropy may be determined from the Gibbs equation.

GHTSRTlnK (1)

Biosensor measurements (QCM, SPR, SAW) require the immobilization of one of the interaction partners on the surface of the sensor chip while the other one is passed in solution over the chip surface. The response is recorded as a function of time and the shape of the signal is independent of the biosensor type (Figure 5). The injection of analyte is associated with a signal increase due to the formation of the intermolecular complex (association curve). At the end of the analyte injection the biosensor signal decreases, due to the dissociation of the protein-ligand complex. In SPR the recorded response represents the change in the refractive index at the chip surface, probed with a laser beam, while QCM sensors exploit the piezoelectric effect of a quartz chip to which an alternating current is applied and measure the change in frequency of the quartz crystal resonator as the analyte binds to the immobilized molecule. The SAW method employed in the present work is also based on the piezoelectric effect of a quartz chip, which enables the conversion of electronic signals into mechanical acoustic waves (called Love waves) [79]. The wave changes both in phase and amplitude in response to the binding of the analyte.

Figure 5. Typical sensorgram for several types of biosensors. The association curve (in blue) represents the signal increase due to the binding of the analyte in the mobile phase to the interaction partner immobilized on the biosensor chip. At the end of the injection the analyte flow is replaced by a buffer flow and the signal decreases due to the dissociation of the complex (dissociation curve, in red).
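The shape of such a sensorgram may be reproduced with a minimal sketch, assuming an idealized 1:1 interaction without mass-transport limitation. The rate constants, analyte concentration and maximal response are hypothetical, and the function name is introduced only for this example.

```python
import math


def sensorgram(t, conc, ka, kd, rmax, t_inject_end):
    """Biosensor response at time t (s) for an idealized 1:1 interaction.

    conc: analyte concentration (M) during the injection (0 <= t <= t_inject_end)
    ka:   association rate constant (M^-1 s^-1)
    kd:   dissociation rate constant (s^-1)
    rmax: response at full saturation of the immobilized partner
    """
    k_obs = ka * conc + kd
    r_eq = rmax * ka * conc / k_obs  # plateau reached in a long injection
    if t <= t_inject_end:
        # Association phase: exponential approach to the plateau.
        return r_eq * (1.0 - math.exp(-k_obs * t))
    # Dissociation phase: buffer flow, exponential decay from the final response.
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_inject_end))
    return r_end * math.exp(-kd * (t - t_inject_end))


if __name__ == "__main__":
    params = dict(conc=1.0e-6, ka=1.0e4, kd=1.0e-2, rmax=100.0, t_inject_end=120.0)
    for t in (0, 30, 60, 120, 180, 300):
        print(f"t = {t:4d} s  response = {sensorgram(t, **params):6.2f}")
    print("KD = kd / ka =", params["kd"] / params["ka"], "M")
```

Fitting both phases of a measured curve to this model gives ka and kd, and their ratio gives the dissociation constant of the 1:1 system.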

The main drawback of biomolecular interaction analyses is their inability to provide chemical structure information about the interacting biomolecules. A common problem of all biosensor methods employed in the study of protein-carbohydrate interactions arises from the fact that most lectins have multiple binding sites or are active in oligomeric form. Multivalency may lead to an apparent affinity constant that is related to avidity and not to the affinity of a 1:1 intermolecular system.

1.3 Mass spectrometric approaches for characterizing biopolymer interactions

1.3.1 Soft-ionization mass spectrometry of biopolymers

Soft-ionization mass spectrometry-based methods, e.g. for the elucidation of interactions of antigenic determinants, have been developed since the early 1990s by combining immuno-affinity techniques with limited proteolysis, followed by the identification of epitope peptides by mass spectrometry (MS) [53-57]. Peptide identification using mass spectrometry is fast, accurate, requires only minimal amounts of material and, due to the high sensitivity, may be performed directly on complex mixtures such as cell lysates. In addition to molecular weight determination, mass spectrometry can deliver structural information for different classes of biomolecules. Thus, various fragmentation techniques are able to provide partial or complete amino acid sequences and information regarding the structure and location of post-translational modifications. Protein folding [80] and protein-protein or protein-ligand interactions can also be studied by mass spectrometry. For the analysis of biopolymers, MS techniques such as electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI) [81, 82] are mainly employed. ESI and MALDI are considered "soft" ionization methods because they impart little energy to the analyte molecules, which results in minimal or no analyte fragmentation. ESI is gentle enough to preserve non-covalent interactions present in solution [83-85].

ESI is an ionization technique which produces gas-phase ions from solution-phase analytes by creating a spray of charged droplets in an electric field [86]. A potential difference (3-6 kV) is applied between the steel capillary (alternatively a gold- or graphite-coated glass capillary) that delivers the sample and the mass spectrometer inlet (Figure 6). Under the action of the electric field the analyte solution forms the so-called "Taylor cone" at the tip of the emitter [87, 88]. The extent of deviation from the normal meniscus shape caused by surface tension depends on the strength of the applied electric field. The charge density is highest at the apex of the Taylor cone and when it approaches the Rayleigh limit, a fine liquid jet is emitted which breaks off into small droplets with diameters in the micrometer range [83]. The electrical current in the circuit closed by the spray is in the nA range [67, 68] and depends on solvent conductivity and applied potentials. The droplet size may be influenced by altering the viscosity and conductivity of the sample (e.g. the addition of formic or acetic acid will increase the conductivity and decrease the droplet size). The droplet formation may be aided by a gas flow (called nebulizing gas, usually nitrogen) concentric with the capillary delivering the sample. The capillary may be positioned at an angle to the mass spectrometer inlet, which ensures that most of the uncharged molecules (solvent and nebulizing gas) do not enter the instrument. In a common source design, the aerosol is drawn into the first vacuum stage of the instrument through a glass capillary, which is usually heated to facilitate ion formation by accelerating solvent evaporation from the charged droplets. Since the solvent evaporates as neutral molecules from the charged droplets, the resulting decrease in droplet surface leads to an increase in charge density. When the electrostatic repulsion between the ions within the droplet becomes greater than the surface tension (Rayleigh limit), the droplet undergoes Coulomb fission, splitting into smaller and more stable droplets [89, 90] (Figure 7).
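The Rayleigh limit may be made concrete with a minimal sketch based on the standard expression q = 8π(ε₀γr³)^½ for the maximum charge a spherical droplet can carry. The surface tension of pure water (about 0.072 N·m⁻¹ at room temperature) and the droplet radii are assumptions chosen for illustration; real spray solvents containing organic modifiers have lower surface tension.

```python
import math

EPSILON_0 = 8.8541878128e-12     # vacuum permittivity, F/m
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def rayleigh_limit_charges(radius_m, surface_tension=0.072):
    """Maximum number of elementary charges on a droplet of the given radius.

    surface_tension is in N/m; the default approximates pure water.
    """
    q_max = 8.0 * math.pi * math.sqrt(EPSILON_0 * surface_tension * radius_m ** 3)
    return q_max / ELEMENTARY_CHARGE


if __name__ == "__main__":
    # Solvent evaporation shrinks the droplet; the charge it can hold drops as r^1.5.
    for radius_um in (1.0, 0.5, 0.1, 0.01):
        n = rayleigh_limit_charges(radius_um * 1e-6)
        print(f"r = {radius_um:5.2f} um  ->  about {n:,.0f} charges at the limit")
```

Because the limit falls with r^1.5 while the charge on an evaporating droplet stays constant, every droplet eventually reaches the limit and undergoes fission.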

[Figure 6 schematic; labels: steel capillary (-4 kV), Taylor cone, glass capillary, heated nitrogen]

Figure 6. Schematic representation of electrospray in positive-ion mode. A potential difference is applied between the steel capillary and the mass spectrometer inlet. Positive ions in the analyte solution drift towards the fluid meniscus, changing its shape into a cone (Taylor cone) and a fine jet of positively-charged particles is emitted. Solvent evaporation is assisted by a flow of heated nitrogen in the first vacuum stage of the mass spectrometer.

Two theories have been proposed to explain the formation of gas-phase ions: the Ion Evaporation Model (IEM) and the Charge Residue Model (CRM) (Figure 7). According to IEM, ions are desorbed directly from the charged droplets, due to the large electric field at the droplet surface. This mechanism is thought to apply mostly to lower molecular weight ions [91, 92]. CRM [73, 74] postulates that consecutive evaporation and fission steps lead to droplets containing a single analyte ion. As the remaining solvent molecules evaporate the charges carried by the droplet are transferred to the analyte. CRM would apply to high molecular weight, globular species. The number of charges depends on the stability of the droplet under the experimental conditions used. For unstructured and partially hydrophobic biopolymers (e.g. denatured proteins), an IEM-like mechanism called the chain ejection model (CEM) has been proposed. It hypothesizes that, due to exposure of hydrophobic residues, the protein chain migrates to the droplet surface and is expelled stepwise and highly protonated into the gas phase [75, 76].

[Figure 7 schematic; labels: charged droplets containing analyte molecules (M) and charges (+), steps 1 and 2, CRM and IEM pathways leading to M+ ions]

Figure 7. Scheme of the electrospray ionization mechanism in positive-ion mode. 1) Due to the large surface-area to volume ratio of the charged sample droplets, the solvent evaporates fast. The droplets get smaller, while the electric charge density increases. 2) When the electrostatic repulsion between analyte ions exceeds the forces of surface tension (Rayleigh limit) the droplets disintegrate (Coulomb fission) forming smaller droplets. CRM: successive evaporation and fission cycles lead to small droplets containing a single analyte ion. As the remaining solvent molecules evaporate, the charges carried by the droplet are transferred to the analyte. IEM: charged droplets close to the Rayleigh limit stabilize themselves by ejecting analyte ions.

ESI generates mainly protonated ([M+nH+]n+) or deprotonated ([M-nH+]n-) molecules with a number of charges dependent on their molecular size, topography and composition. Since ions are analyzed based on their m/z ratio, the formation of multiply charged ions gives ESI-equipped MS instruments the capability to analyze large biopolymers that would otherwise be outside the mass range of the analyzer. In positive-ion mode the charges come mainly from protons, due to the acidic pH of the solvent and various electrochemical reactions [77, 78], but also from other ions present in the sample, e.g. Na+, K+, [NH4]+. The number of charges that may be carried by peptides and proteins depends, in positive-ion mode, on the number of solvent-exposed basic amino acid residues and the unmodified N-terminus. In negative-ion mode the ionization occurs by deprotonation of acidic amino acids and of the free C-terminus. Since the number of solvent-exposed residues varies with the degree of folding, information on the tertiary protein structure can be obtained [76, 79]. ESI-MS employed in the present work uses direct current (DC); however, electrospray ionization is also possible using high-frequency alternating current (AC) [80, 81].
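The relation between molecular mass, charge state and m/z for protonated ions may be illustrated with a minimal sketch. It assumes that all charges are protons (no Na+ or K+ adducts) and uses a hypothetical protein mass; the function names are introduced for this example only.

```python
PROTON_MASS = 1.007276  # Da


def mz_of_charge_state(mass, n):
    """m/z of the [M+nH]n+ ion for a neutral mass (Da) and n attached protons."""
    return (mass + n * PROTON_MASS) / n


def deconvolute_adjacent(mz_low, mz_high):
    """Recover charge and mass from two neighbouring peaks of one charge envelope.

    mz_low carries n+1 charges and mz_high carries n charges.
    Returns (n, neutral_mass), with n the charge of the higher-m/z peak.
    """
    n = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    return n, n * (mz_high - PROTON_MASS)


if __name__ == "__main__":
    mass = 14300.0  # hypothetical protein mass in Da
    for n in range(7, 11):
        print(f"[M+{n}H]{n}+  m/z = {mz_of_charge_state(mass, n):.2f}")
    print(deconvolute_adjacent(mz_of_charge_state(mass, 9), mz_of_charge_state(mass, 8)))
```

The example shows why a protein of about 14 kDa appears well below m/z 2500 once it carries seven or more protons.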

For MALDI-MS (Figure 8) the analyte is co-crystallized with an excess of non-volatile, UV- or IR-absorbing matrix. MALDI is usually performed under vacuum, but is also possible at atmospheric pressure [93, 94]. Energy uptake upon sample irradiation with short laser pulses (~ns) causes desorption and ionization of the analyte-matrix mixture. The desorption occurs with supersonic speed (~400-2000 m·s⁻¹) and the resulting particle cloud cools by adiabatic expansion [95, 96]. Some instrument designs employing complex ion-transfer optics (e.g. FT-ICR) require further cooling of the ions, which may be achieved by pulsing an inert gas (argon or nitrogen, at ~10 mbar) into the source, in synchronization with the laser shots [97]. For optimal ionization yields in MALDI several laser parameters such as energy, pulse width and frequency must be monitored and adjusted for each measurement, to prevent heating in the bulk of the sample spot and thermal decomposition of the analyte. Since the laser power is fixed, the amount of energy hitting the sample is controlled through attenuation. For this purpose, a variable aperture (iris) and a rotating neutral density filter disk with variable transmittance are incorporated in the laser optics part of the source.

In the gas phase, positive ionization of analyte molecules usually occurs through transfer of protons from the matrix ([M+H+]+) [98], but also by attachment of alkali metal ions (e.g. [M+Na+]+, [M+K+]+) [99, 100]. It was shown that analytes that are charged in the acidic sample solution prior to crystallization can retain the charge in the solid phase. A fraction of these preformed ions can survive in the MALDI plume and contribute to the ionization yield [99]. Ions produced by MALDI are often singly charged [101]. Large proteins may carry multiple charges [102, 103], yet not to the extent observed in ESI.

The next step in MALDI-MS is ion extraction, i.e. the acceleration of ions into the mass analyzer by applying a potential difference between the sample holder and the analyzer entrance. The ion extraction may be delayed in order to compensate for the initial energy distribution of the ions and improve resolution (Figure 8). Delayed extraction may be achieved in different ways depending on the type of mass analyzer. In time-of-flight (TOF) mass spectrometers, equal potentials are applied to the sample holder and the extraction grid during desorption. After a short delay (~ns for linear sources) the potential on the sample plate is increased and the newly formed electric field accelerates the ions. Slower ions, further away from the grid, will receive more energy and catch up with the faster ions, reaching the detector at the same time. Increased resolution and control are achieved with a dual-stage extraction ion source [104]. In the case of FT-ICR instruments delayed extraction is performed using a multipole placed between the sample holder and the extraction plate. The ions are held in this trap, cooled and then directed into the ion optics by changing the potential on the extraction plate.
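The dependence of the flight time on m/z in a TOF analyzer may be sketched as follows. This is a deliberately simplified model of a linear instrument: the ion acquires the kinetic energy z·e·U and then drifts through a field-free tube, while the time spent in the acceleration region and the initial velocity are neglected. The accelerating voltage and drift length are hypothetical.

```python
import math

DALTON = 1.66053906660e-27          # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def flight_time(mass_da, charge, voltage, drift_length):
    """Drift time (s) of an ion through a field-free tube.

    Energy balance: z*e*U = m*v^2 / 2, hence t = L * sqrt(m / (2*z*e*U)).
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2.0 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return drift_length / velocity


if __name__ == "__main__":
    # Hypothetical instrument: 20 kV acceleration, 1 m drift tube.
    for mass in (1000.0, 4000.0, 16000.0):
        t_us = flight_time(mass, 1, 20.0e3, 1.0) * 1e6
        print(f"m = {mass:8.0f} Da, z = 1  ->  t = {t_us:6.2f} us")
```

Since t scales with the square root of m/z, a spread in the energy actually acquired by ions of the same m/z translates directly into a spread in arrival times, which is what delayed extraction is meant to reduce.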

[Figure 8 schematic; labels: laser beam, metal sample holder, potentials U1 and U2, extraction grid, acceleration grid, analyzer, detector]

Figure 8. Schematic representation of the MALDI process in positive-ion mode in a double-field source. The sample (in blue) is mixed with excess matrix (in yellow) and dried on a stainless steel plate. The laser pulse desorbs and ionizes the matrix, which transfers protons to the analyte. For delayed extraction an extra grid is placed between the target and the grounded acceleration grid. This setup is suitable for time-of-flight (TOF) analyzers. At the moment of desorption, the target and the extraction grid are at equal positive potentials (U1 = U2), therefore the ions are not subjected to any force. After a brief delay (~ns) the potential on the extraction grid is lowered (U2 < U1). Consequently, slower ions (which are further away from the grid) are accelerated more than the faster ions (which are closer to the extraction grid). In the end, ions of the same mass-to-charge ratio with initially different kinetic energies reach the detector simultaneously.

The choice of matrix has a great influence on the success of ionization in MALDI-MS. For the analysis of low molecular weight peptides by UV-MALDI, α-cyano-4-hydroxycinnamic acid (CHCA) is usually employed [102]. CHCA forms small crystals and yields homogeneous spots. It is considered a "hard" matrix, requiring more energy for ablation, which translates into an increased probability of analyte fragmentation [105]. 3,5-dimethoxy-4-hydroxycinnamic acid (SA, sinapinic acid) is a softer matrix, reducing the risk of analyte fragmentation, and is hence more suitable for measurements of proteins [106]. 2,5-dihydroxybenzoic acid (DHB, gentisic acid) also allows a softer desorption and may be used for oligosaccharides and peptides [107], alone or mixed with CHCA [108] or other additives [109, 110]. Picolinic acid (PA) and derivatives [100, 101] are suitable for oligonucleotides and DNA. The interpretation of MALDI mass spectra may be complicated in the lower mass range (m/z < 500-800) by the presence of matrix signals. Matrices like CHCA can produce complex clusters with various numbers of sodium and potassium ions, depending on the concentration of salts and analyte in the sample solution [111]. However, these cluster ions are not always a nuisance, as they may be used for internal calibration [112].

Although ESI is the method better suited to preserving intact molecular structures, MALDI-MS has some advantages over ESI-MS. First, mass analysis may be synchronized with the ion formation, due to the pulsed nature of the technique. This translates into high sensitivity (sub-femtomole), since nearly all ions produced are also detected. Furthermore, the formation of singly charged ions leads to fewer signals, thereby facilitating the interpretation of mass spectra of complex mixtures. MALDI-MS also has a higher tolerance for salts, which however depends on the matrix used [110]. A disadvantage of MALDI is the strong influence that matrix purity and spot preparation have on the MS results. The homogeneity of the analyte-matrix crystalline lattice as well as the shape, size and orientation of the crystals cause variations in accuracy, resolution and signal intensity, as the laser is moved across the sample surface or from spot to spot [113]. Salts and other contaminants excluded during the crystallization process toward the surface of the crystals affect the type of ions formed. To remove these contaminants and increase sensitivity, spots prepared with matrices with low water solubility (e.g. CHCA) may be briefly washed with cold water without major analyte loss. Depending on the absorbance of the analyte and the laser wavelength, there is the possibility of photochemical fragmentation [114].

1.3.2 Mass spectrometric approaches to the analysis of recognition structures in proteins

A method gaining increasing attention for studying intra- and inter-molecular interactions combines hydrogen-deuterium exchange of backbone amide hydrogens with mass spectrometry (HDX-MS) [106-109]. HDX is performed by incubating the protein in deuterated solvent, which leads to the exchange of amide hydrogens from the peptide bonds with deuterium atoms. HDX is usually carried out at physiological pH, to keep the proteins in a native conformational state. Since the exchange is strongly pH and temperature dependent, with the lowest rate at pH ~2 and 0 °C, the deuterium label may be preserved by rapidly lowering both parameters for the analyte solution (quenching). The extent and speed of HDX depend on the solvent accessibility of various regions in proteins or protein-ligand complexes, providing information on tertiary and quaternary structures [115, 116]. For example, HDX rates of antigen polypeptides shielded through binding of an antibody will be lower than the rates resulting from the separate analysis of the antigen. By varying the exchange time and the ligand to protein ratio, the affinity and stoichiometry of the protein complex may also be determined [117]. The location and relative amount of incorporated deuterium atoms may be determined by proteolytic peptide mapping at acidic pH (using pepsin), performed after quenching the exchange reaction. To minimize the inevitable back-exchange, both proteolysis and MS must be carried out quickly and at low temperature. The major drawback of HDX-MS is that it requires significant time and effort for sample preparation, management of back-exchange and assignment of exchange rates to single amino acid residues, based on the data obtained for the peptic fragments.
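How the deuterium content of a peptic fragment may be estimated from measured masses, including a correction for back-exchange, is shown in the following minimal sketch. It assumes the commonly used normalization against an undeuterated and a fully deuterated control sample; the peptide sequence, the centroid masses and the function names are hypothetical, and the counting rule for exchangeable amides is a simplification.

```python
def exchangeable_amides(peptide):
    """Count backbone amide hydrogens: every residue except the first and prolines."""
    return sum(1 for residue in peptide[1:] if residue != "P")


def deuterium_uptake(m_t, m_0, m_100, n_amides):
    """Back-exchange-corrected number of deuterons at exchange time t.

    m_t:   centroid mass of the peptide after exchange time t
    m_0:   centroid mass of the undeuterated control
    m_100: centroid mass of the fully deuterated control
    """
    return n_amides * (m_t - m_0) / (m_100 - m_0)


if __name__ == "__main__":
    peptide = "LVNEPGAKF"  # hypothetical peptic fragment
    n = exchangeable_amides(peptide)
    # Hypothetical centroid masses (Da) for free and ligand-bound protein.
    free = deuterium_uptake(m_t=962.9, m_0=958.5, m_100=964.0, n_amides=n)
    bound = deuterium_uptake(m_t=960.7, m_0=958.5, m_100=964.0, n_amides=n)
    print(f"{n} exchangeable amides; free: {free:.1f} D, bound: {bound:.1f} D")
```

A lower corrected uptake in the bound state than in the free state would indicate that the corresponding region is shielded in the complex.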

Protein-ligand interactions may also be studied by chemical crosslinking coupled to mass spectrometry (CXMS) [118-124]. CXMS employs homo- or hetero-bifunctional molecules to join spatially close amino acid residues through linkers of defined lengths. Since the crosslinking reactions take place in solution, the native structures of proteins and their complexes are preserved. Crosslinking may also be performed "in vivo", to stabilize transient complexes and examine protein-protein interactions in specific cellular processes [123]. The location of the linkers is then identified by mass spectrometry, using a "top-down" or "bottom-up" approach [124]. Information on the three-dimensional structure of the protein is obtained from the bonds formed within the polypeptide chain, while inter-chain bonds in a protein complex will provide information on the interacting structures. Intra-chain crosslinks may also provide information regarding conformational changes induced by ligand binding [111], although the presence of multiple protein conformers will complicate data analysis. The main drawback of this method is the difficulty of analyzing the many types of chemical structures that form, such as unmodified peptides, mono-linked peptides (only one end of the cross-linker reacts with the protein, while the other is deactivated), intra-peptide crosslinks, regular crosslinked peptides and higher-order crosslinked peptides [119, 124]. Other drawbacks include the modification of lysine residues by amine-reactive crosslinking reagents, which results in the blocking of tryptic cleavage sites and the possible creation of very large crosslinked peptides, which may be difficult to analyze [122].

Another approach for characterizing the tertiary structure of proteins combines selective chemical modification of amino acid residues with mass spectrometric peptide mapping [54, 55]. This fast and sensitive method has also been employed for the identification of conformational epitopes [125-127]. In this case, chemical modification of both free and antibody-bound antigen molecules is performed. Next, the modified free antigen and the modified antigen separated from the immune complex are proteolytically digested and characterized by mass spectrometric peptide mapping. Subsequent comparison of the antigen modification patterns reveals differences in the relative chemical reactivity of specific amino acid residues, which reflect the surface accessibility of the modified residues in the antigen before and after interacting with the antibody. Due to shielding of epitope regions in the antigen by the antibody, the modification yield of amino acids involved in antigen-antibody interactions is lower for the antibody-bound antigen than for the free antigen. This information allows the identification of the epitope, which comprises the shielded amino acids. This method has the advantage of producing less complex reaction mixtures than chemical crosslinking, making data analysis easier.

Two molecular chemical methods first described and applied in our laboratory for the identification of protein-protein interaction sites are epitope excision and extraction [54, 116, 117] (Figure 9). The epitope excision method is based on selective proteolysis of, e.g., an antigen-antibody complex in combination with mass spectrometry. First, the antibody is immobilized on a solid support (e.g. agarose, silica beads) and incubated with the antigen in a micro-column. Next, specific enzymatic proteolysis is carried out. The epitope fragments recognized by the antibody are thereby shielded by it against digestion and will remain bound to the immobilized antibody. The other fragments are removed and collected for mass spectrometric analysis. Finally, the epitope-antibody complex is dissociated and the eluted fragments are identified by mass spectrometry. Antibodies in the native state are highly resistant to proteases; therefore, depending on the protease employed, the digestion time and the harshness of the elution procedure, the antibody column is typically reusable.
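The assignment of eluted fragments to the antigen sequence may be illustrated with a minimal sketch that performs a theoretical digest and compares calculated [M+H]+ masses with measured values. Trypsin specificity (cleavage after K or R, not before P, no missed cleavages), the antigen sequence, the measured mass and the mass tolerance are all assumptions made for this example; standard monoisotopic residue masses are used.

```python
# Monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def assign(measured_masses, sequence, tolerance=0.02):
    """Return the digest peptides whose [M+H]+ matches a measured mass (Da)."""
    return [p for p in tryptic_digest(sequence)
            if any(abs(mh_plus(p) - m) <= tolerance for m in measured_masses)]


if __name__ == "__main__":
    antigen = "AKRPGKLMDEFR"   # hypothetical antigen sequence
    elution_masses = [430.256]  # hypothetical masses from the elution fraction
    print(tryptic_digest(antigen))
    print(assign(elution_masses, antigen))
```

Peptides assigned in the elution fraction but absent from the wash fractions would be candidates for the epitope.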

[Figure 9 schematic; labels: antigen, antibody, binding, proteolysis, excision, extraction, washing, elution, epitope, MS spectra (m/z)]

Figure 9. Schematic representation of the epitope excision and epitope extraction procedures, exemplified on an antibody-antigen system. In epitope excision the antigen is first incubated with the immobilized antibody and the complex is then digested with various proteases. Unbound fragments are subsequently removed by washing. In the end, affinity-bound epitope peptides are dissociated from the antibody. In epitope extraction the antigen is first digested in solution and the resulting proteolytic fragments are presented to the immobilized antibody. All recovered fractions are analyzed by mass spectrometry and the obtained results are compared.

In the epitope extraction procedure [118-120], the antigen is first proteolytically digested and the resulting peptide mixture applied on the antibody column. Due to the high specificity of the antibody-antigen recognition, only peptides containing the epitope sequences interact with the corresponding paratope and are
