Suppl. Figure 5.1 Alignment of the primary structure of CEACAM3 and CEACAM4 and mRNA expression of CEACAM3 / CEACAM4 in transfected 293 cells. (A) The amino acid sequences of CEACAM3 (UniProt: P40198) and CEACAM4 (UniProt: O75871) were aligned using CLUSTALW. Identical amino acids are indicated by “.”, whereas “-” marks gaps introduced to maximize homology. The amino-terminal signal peptide, the extracellular IgV-like domain, the transmembrane domain, and the cytoplasmic part are indicated as shown in the box. Both proteins harbour an intracellular immunoreceptor tyrosine-based activation motif (ITAM)-like sequence (underlined) that in the case of CEACAM4 closely resembles the indicated ITAM consensus sequence. (B) The predicted beta-strands (A-G) of the extracellular IgV-like domain (according to the crystal structure of murine CEACAM1; Tan K. et al. (2002) EMBO J. 21: 2076-86) are marked. Amino acid differences between CEACAM3 and CEACAM4 are highlighted in yellow. (C) 293 cells were transfected with the empty pLPS-3’-EGFP vector or the same plasmid encoding CEACAM3 or CEACAM4, respectively. Total RNA was isolated and mRNA was reverse transcribed using oligo(dT) primers. Quantitative real-time PCR using TaqMan probes specific for either CEACAM3 or CEACAM4 gene sequences was performed to demonstrate the specificity of the primer sets used. Indeed, CEACAM3-specific primers yielded a PCR fragment only in CEACAM3-transfected, but not in CEACAM4-transfected cells, whereas CEACAM4-specific primers did not amplify a PCR product in CEACAM3-transfected 293 cells.

Suppl. Figure 5.2 CEACAM3 and CEACAM4 protein expression in transfected 293 cells. Opa protein expression by Neisseria gonorrhoeae. (A) 293 cells were transfected with vectors encoding GFP, CEACAM3-GFP, or CEACAM4-GFP, respectively, and two days later, whole cell lysates (WCLs) were prepared. WCLs were separated by SDS-PAGE and CEACAM expression was confirmed by western blotting with a monoclonal anti-GFP antibody. (B) Opa protein expression by the bacteria was verified by western blotting using a monoclonal anti-Opa antibody with lysates derived from N. gonorrhoeae (Ngo) expressing OpaCEA protein (Ngo OpaCEA) or lacking Opa protein expression (Ngo Opa-).

Suppl. Figure 5.3 Expression of CEACAM3/4 chimeric proteins and v-Src-mediated tyrosine phosphorylation of CEACAM4. (A) 293 cells were transfected with pcDNA or pLPS3’mKate encoding CEACAM3, CEACAM3/4 WT, CEACAM3/4 ∆CT, and CEACAM3/4 Y222/233F or left untransfected. Two days later, cells were analyzed by flow cytometry (10,000 cells per sample) for CEACAM-mKate expression. Values represent the percentage of mKate-positive cells. In all cases, transfection efficiency of the mKate fusion proteins was higher than 50 %. (B) Cells were transfected as in (A). An additional sample was transfected with an mKate-encoding plasmid. Whole cell lysates (WCL) of the transfected cells were prepared and analyzed for mKate expression by western blotting with anti-mKate antibodies. Multiple faster migrating bands, which are recognized by the mKate antibody, appear in the sample containing CEACAM3/4 ∆CT. These bands presumably represent immature (un- or hypo-glycosylated) forms of the protein, which do not seem to occur for the wild-type chimera or the ITAM mutant, indicating a role for the cytoplasmic domain of CEACAM4 in passage through secretory compartments. (C) 293 cells were transfected with vectors encoding mKate or mKate-tagged CEACAM3, CEACAM3/4, CEACAM3/4 Y222/233F, or CEACAM3/4 ∆CT, respectively. Two days later, cells were infected with OpaCEA protein-expressing N. gonorrhoeae (MOI 30) for 1 hour and total cell-associated bacteria were measured by bacterial binding assays. Bars represent mean ± standard deviation (n=3) of total cell-associated bacteria. (D) 293 cells were transfected with vectors encoding GFP or CEACAM4-GFP and cotransfected as indicated with v-Src. WCL were analyzed for expression of GFP or CEACAM4-GFP by western blotting using a monoclonal anti-GFP antibody (upper panel). The same samples were probed with an antibody against phosphotyrosine (anti-pY72; lower panel). The results demonstrate that CEACAM4 is strongly tyrosine phosphorylated in the presence of v-Src. In contrast, tyrosine phosphorylation of CEACAM4 is not detectable under these conditions in the absence of v-Src.

5.7 Acknowledgments

We thank T.F. Meyer (MPI for Infection Biology, Berlin, Germany) for the Neisseria strains used in this study, W. Zimmermann (Ludwig-Maximilians-Universität, München, Germany) for providing CEACAM cDNA, R. Frank (Leibniz-Institut für Molekulare Pharmakologie, Berlin, Germany) for synthesis of peptide spot membranes, and R. Hohenberger-Bregger and S. Feindler-Boeckh for expert technical assistance. This study was supported by funds from the DFG (Ha2568/6-2) to CRH and the Swiss National Science Foundation (31003A_143739) to MPT. JGDT acknowledges support by the DAAD (Deutscher Akademischer Austauschdienst).

6 Chapter IV

Finding a bacterial ligand of the orphan receptor CEACAM4

Julia Delgado Tascón1, Tancred Frickey2, and Christof R. Hauck1,3

1 Lehrstuhl für Zellbiologie, Universität Konstanz, 78457 Konstanz, Germany

2 Lehrstuhl für angewandte Bioinformatik, Universität Konstanz, 78457 Konstanz, Germany

3 Konstanz Research School Chemical Biology, Universität Konstanz, 78457 Konstanz, Germany

In preparation

6.1 Abstract

The human CEACAM family includes one receptor, CEACAM4, for which no ligand is currently known. Due to the absence of the CEACAM4 gene in rodents and the lack of a bona fide ligand, the function of this CEACAM remains elusive. CEACAM family members are mostly expressed on epithelial or endothelial cells, where they can be exploited by different pathogenic bacteria to successfully colonize the human body. However, CEACAMs can also be expressed in the myeloid lineage, where they help innate immune cells to phagocytose and eliminate invading bacteria. Interestingly, CEACAMs bind not only pathogenic bacteria, but also commensal bacteria including Neisseria lactamica and, as reported recently, Prevotella species. The fact that none of the known pathogenic CEACAM-binding bacteria has so far been found to bind CEACAM4 raises the question whether CEACAM4 could instead interact with commensal bacteria. This study aimed to screen for a bacterial ligand of CEACAM4 within the human intestinal flora. For this purpose, human stool samples from 10 healthy volunteers were pooled. A soluble GFP fusion protein of the N-terminal domain of CEACAM4 was used to capture and enrich putative CEACAM4-binding bacteria, which were subsequently isolated by affinity isolation. The isolated bacterial population was analysed by next generation 16S rRNA sequencing, followed by cluster analysis of sequences (CLANS) and SILVA database searches, to identify potential CEACAM4-binding bacteria. Bioinformatic analyses identified two phyla (Firmicutes and Bacteroidetes) with a meaningful enrichment in the CEACAM4-associated bacteria. Therefore, isolated species of Prevotella, Clostridiales and Selenomonadales were tested in CEACAM4 pull-downs. However, specific binding of commensal bacteria to CEACAM4 could not be confirmed biochemically. We therefore suggest improvements to the setup for future efforts to find a ligand of the orphan receptor CEACAM4.

6.2 Introduction

Proteins encoded by CEACAM genes are widely expressed and display great diversity between the different mammalian lineages. Genes of the human CEACAM family are encoded on chromosome 19 and belong to the 139 fastest evolving human genes, most likely due to a species-specific coevolution between hosts and bacteria (Kammerer and Zimmermann, 2010; Zid and Drouin, 2013). The human CEACAM family members are prominently expressed in epithelial cells, endothelial cells and leukocytes; accordingly, they have a wide range of biological functions, including general cellular processes such as cell adhesion, differentiation, proliferation and survival, as well as mediating innate immune responses (Kuespert et al., 2006). CEACAMs can be exploited by bacterial pathogens to enter and successfully colonize mucosal surfaces of the human body (Barnich et al., 2007; Chen and Gotschlich, 1996; Gray-Owen et al., 1997a; Hill et al., 2001; Hill and Virji, 2003; Sauter et al., 1993; Schmitter et al., 2004; Tchoupa et al., 2014; Virji et al., 1996).

However, similar to pattern recognition receptors (PRRs) like Toll-like receptors (TLRs), CEACAMs expressed on leukocytes can recognize invading microorganisms and initiate an opsonin-independent immune response to combat infection. Apart from human-specific bacterial pathogens like Neisseria gonorrhoeae, Neisseria meningitidis, Haemophilus influenzae, Moraxella catarrhalis as well as some strains of pathogenic Escherichia coli, CEACAMs are also able to recognize non-pathogenic microbes like Neisseria lactamica, which provides a protective mechanism against Neisseria meningitidis (Evans et al., 2011; Toleman et al., 2001). Moreover, a recent study has reported that human CEACAMs can also bind commensal bacteria like Prevotella species (Roth, 2013).

In this line, it has been proposed that PRRs may have evolved to mediate bidirectional crosstalk between commensals and their hosts in order to maintain a beneficial, symbiotic coexistence with the microbiota (Chu and Mazmanian, 2013). The human body contains over 10 times more microbial cells than human cells and is thus host to a wide variety of microbial communities (called microbiota or commensal microflora), which are important for the normal development of the immune system (Qin et al., 2010). Remarkably, receptors for bacteria recognize not only virulence factors of pathogens but also bacterial patterns that are shared by entire classes of bacteria and can therefore be produced by commensal microorganisms as well. However, the differences between these two groups of microorganisms, and how exactly the host distinguishes receptor activation by pathogenic bacteria from that by commensal bacteria, are not well understood. Members of the human CEACAM family share high similarity in protein structure as well as in their bacterial ligands, which they bind in a species-specific manner (Kammerer et al., 2007; Kammerer and Zimmermann, 2010; Voges et al., 2010). CEACAM4 is an orphan receptor of the human CEACAM family and is exclusively expressed by human granulocytes. The study of this CEACAM has been neglected during the last decades, mainly due to the absence of CEACAM4 in rodents and the lack of a bona fide ligand for this membrane protein. Interestingly, CEACAM3 is a closely related receptor that is also expressed exclusively on granulocytes and acts as a single-chain phagocytic receptor initiating the detection, internalization, and destruction of a limited set of human-restricted bacteria (Buntru et al., 2012). However, none of the known CEACAM-binding bacteria has so far been found to bind CEACAM4, raising the question whether CEACAM4 can promote beneficial interactions between the host and members of the normal microbiota or commensal bacteria.

Since the gastrointestinal tract is the primary site of interaction between the host immune system and microorganisms (symbiotic or pathogenic), the gut microbiota plays a crucial role in shaping immune responses during health and disease (Round and Mazmanian, 2009). In the present study, we aimed to identify a bacterial ligand of CEACAM4 from the human gut microbiota by next generation sequencing of 16S rRNA genes. For this purpose, a soluble form of the N-terminal domain of CEACAM4 was used to pull down potential CEACAM4-binding bacteria from human stool samples of 10 healthy volunteers. 16S rRNA gene pyrosequencing was employed for bacterial identification. From around 600,000 sequences, cluster analysis of sequences (CLANS) and searches of the SILVA database identified sequences enriched in CEACAM4 samples compared to the input; these sequences belonged to two bacterial phyla (Firmicutes and Bacteroidetes). However, subsequent biochemical analysis of bacterial candidates did not show any significant binding to the N-terminal domain of CEACAM4. Therefore, our data do not provide further insight into the CEACAM4 binding profile under the conditions used. For future studies, an adaptation of the approach is needed to successfully identify potential microbial ligands of CEACAM4.

6.3 Materials and Methods

Cell culture

Human embryonic kidney epithelial 293T cells (293 cells) were cultured in Dulbecco’s modified Eagle’s medium (DMEM) containing 10 % CS. Cells were grown at 37 °C in 5 % CO2 and subcultured every 2-3 days.

Recombinant DNA constructs

Mammalian expression plasmids encoding soluble GFP-tagged N-terminal domains of human CEACAM1, CEACAM3, CEACAM4, CEACAM5 (CEA), CEACAM6 or CEACAM8, as well as canine CEACAM1, were described previously (Kuespert et al., 2007; Voges et al., 2010). Plasmid pRc/CMV encoding the cDNA of human CEACAM7 (provided by W. Zimmermann, Universitätsklinikum Grosshadern, München, Germany) was used as a template for PCR amplification of the N-terminal domain of CEACAM7 with primers CEA7-NT-IF-sense 5’-GAAGTTATCAGTCGACACCATGGGTTCCCCTTCAGCC-3’ and CEA7-NT-IF-anti 5’-ATGGTCTAGAAAGCTTTGCTTGAAGTACTCTGGGTGCGAGAATACGTAGAATTGTCTGG-3’. The resulting PCR fragment was cloned into pDNR-Dual using the In-Fusion PCR Cloning Kit (Clontech, Mountain View, CA), verified by sequencing, and transferred by Cre-mediated recombination into pLPS-3′EGFP (Clontech), resulting in GFP fused to the carboxy-terminus of the N-terminal domain of CEACAM7 (N-CEACAM7).

Expression of CEACAM soluble N-terminal domains

Expression of the soluble GFP fusion proteins in 293 cells was performed as described previously (Kuespert et al., 2007; Schmitter et al., 2004). Briefly, 293 cells were transfected by standard calcium-phosphate co-precipitation with 7 µg of plasmid DNA in a 10 cm culture dish, and 24 hours later the culture medium was replaced by OptiMem (Gibco BRL). After 72 hours, the cell culture medium (supernatant) containing the GFP-tagged N-terminal domain of CEACAMs (N-CEACAM) was collected and kept at 4 °C or, for long-term storage, at -20 °C.

Bacterial strains and growth conditions

The Gram-negative, non-piliated Neisseria gonorrhoeae MS11 strains used were the CEACAM-binding Opa protein strain N309 (Ngo OpaCEA) and the non-CEACAM-binding strain N302 (Ngo Opa-) (Kupsch et al., 1993a). Both were kindly provided by Thomas Meyer (Max-Planck-Institut für Infektionsbiologie, Berlin, Germany) and cultured as described previously (Schmitter et al., 2004).

The anaerobic Gram-negative Prevotella strains Prevotella intermedia (DSM-20706), Prevotella nigrescens (DSM-13386), Prevotella timonensis (DSM-22865) and Prevotella copri (DSM-18205) were obtained from the Deutsche Sammlung für Mikroorganismen und Zellkulturen (DSMZ). Prevotella strains were grown anaerobically at 37 °C in Hungate anaerobic tubes (16 x 125 mm complete tubes, Dunn Labortechnik, Germany) in modified peptone yeast based medium (104 PYG medium, DSMZ) supplemented with sodium thioglycolate (500 mg/ml) or, for Prevotella timonensis, in chopped meat carbohydrate broth (pre-reduced medium in Hungate tubes, BD). Prevotella strains were subcultured every 3 days.

The Gram-positive bacteria Megasphaera cerevisiae (DSM-20462), Megasphaera elsdenii (DSM-20460), Flavonifractor plautii (DSM-4000) and Eubacterium siraeum (DSM-15702) were obtained from the Deutsche Sammlung für Mikroorganismen und Zellkulturen (DSMZ). Megasphaera strains were grown anaerobically at 37 °C in Hungate anaerobic tubes (16 x 125 mm complete tubes, Dunn Labortechnik, Germany) in peptone yeast based medium (104b PY +X medium, DSMZ). Flavonifractor plautii, Eubacterium siraeum and Clostridium strains were grown anaerobically at 37 °C in Hungate anaerobic tubes (16 x 125 mm complete tubes, Dunn Labortechnik, Germany) in modified peptone yeast based medium (104 PYG medium, DSMZ). DSMZ media were supplemented with sodium thioglycolate (500 mg/ml) and bacteria were subcultured every 3 days.

Coupling of soluble GFP fusion CEACAM proteins to micro-magnetic GFP beads

Anti-GFP magnetic beads were coupled with soluble GFP fusion proteins by fast magnetic immunoprecipitation using the µMACS GFP Isolation Kit (Miltenyi Biotec). 50 µl of micro-magnetic bead suspension was incubated with soluble CEACAM domains or GFP alone as a control (1:20 dilution) for 30 min on ice. Samples were loaded onto an equilibrated MACS column (200 µl lysis buffer in a MS 25 column, Miltenyi Biotec) placed in the magnetic field of a miniMACS separator. Each sample-bead solution was retained in the column during the washing steps (five times with 200 µl of wash buffer 1 and once with 200 µl of wash buffer 2). Columns were removed from the separators and samples were eluted with 50 µl PBS. After elution, the CEACAM- and control-bead solutions were stored at 4 °C for a maximum of 2 hours prior to use.

Enrichment of CEACAM-binding bacteria by magnetic bead isolation

1 g of fresh human stool sample from healthy volunteers was diluted in 10 ml PBS and filtered through an Acrodisc PSF syringe filter (PALL Corporation) with a pore size of 10 µm. The filtered stool samples were incubated with protein-coupled beads (1:20 dilution) for 30 min at RT under rotation. A MACS column (MS 25, Miltenyi Biotec) was placed in the MiniMACS separator and equilibrated with 200 µl of lysis buffer. Afterwards, the stool sample-protein-bead solution was applied onto the column. The column was washed three times with 200 µl PBS-T and two times with 200 µl PBS. The column was removed from the separator and the bound bacteria were eluted with 50 µl double distilled water (ddH2O). Samples were boiled for 10 min. Cell debris was pelleted (1400 x g, 10 min) and supernatants were collected.

454 next generation sequencing of microbial 16S rRNAs

The universal primer pair 8F-338R, covering the hypervariable regions V1 and V2 of the 16S ribosomal RNA (16S rRNA) gene, was used to amplify short amplicons (fragments of 368 bp) of the bacterial 16S rRNA gene. These primers are fused with multiplex identifier (MID) adaptors for GS FLX Titanium sequencing. DNA libraries were prepared with MID-labelled amplicon fusion primers (10-nucleotide sequence tag) (Table 6.1). Reverse primers contained the 454 Life Sciences GS FLX Titanium primer A sequence, followed by the 10-base MID barcode, a CA nucleotide linker and the broad-range bacterial primer 338R (bold letters). The forward primer contained the 454 Life Sciences GS FLX Titanium primer B sequence, a TC nucleotide linker and the broadly conserved bacterial primer 8F (bold letters). Approximately 100 ng of DNA template was added to a PCR reaction mix containing 1 µl Pfu polymerase, 5 µl of 10x buffer, dNTPs and 10 µM of the 16S rRNA primers in a final volume of 50 µl. Following an initial denaturation step at 94 °C for 3 min, 30 cycles of 94 °C / 20 s, 55 °C / 30 s and 72 °C / 60 s were run, followed by a final extension at 72 °C for 5 min. Negative controls were included for both the bacterial DNA enrichment and the 16S rRNA PCR.

The amplicons were purified with the QIAquick gel extraction kit (Qiagen) and quantified using the PicoGreen Assay Kit (Invitrogen). Subsequently, DNA libraries were immobilized onto beads, clonally amplified by emulsion PCR (emPCR amplification) and pooled for multiplex sequencing on the genome sequencer FLX instrument (454 FLX Titanium pyrosequencer, Genomic Center, University of Konstanz). Samples were pooled in two regions of the PicoTiterPlate (PTP) device (GS FLX Titanium PicoTiterPlate Kit 70x75, Roche) and sequencing was performed in duplicate. Following sequencing, reads from the pooled libraries were identified by their MID tag and assigned to the corresponding sample.
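The assignment of pooled reads to their libraries by MID tag can be sketched as follows. This is a minimal illustration, not the software used in this study. It assumes that the 4-base sequencing key has already been trimmed and that a barcode must match exactly at the start of a read. The function name, the barcode names and the second barcode are hypothetical; the first barcode is the 10 bases that precede the CA linker in the reverse primer of Table 6.1.

```python
from collections import defaultdict

# Barcode table. "MID1" is read off the reverse primer in Table 6.1;
# "MID_other" is a hypothetical placeholder, not a barcode from this study.
BARCODES = {
    "MID1": "ACGAGTGCGT",
    "MID_other": "TTGGCCAATT",
}


def demultiplex(reads, barcodes=BARCODES):
    """Group reads by their leading MID tag and strip the tag from each read."""
    bins = defaultdict(list)
    for read in reads:
        for name, tag in barcodes.items():
            if read.startswith(tag):
                bins[name].append(read[len(tag):])
                break
        else:
            # No barcode matched: set the read aside instead of guessing.
            bins["unassigned"].append(read)
    return dict(bins)
```

A production pipeline would typically also tolerate sequencing errors within the barcode; exact matching is used here only to keep the idea visible.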

Table 6.1 16S rRNA fusion primers

Amplicon    Primer               Sequence
Reverse     PrimerA-MiD1-338R    CCATCTCATCCCTGCGTGTCTCCGACTCAGACGAGTGCGTCATGCTGCCTCCCGTAGGAGT
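As a consistency check on the primer layout described in the text (adaptor, 10-base MID barcode, CA linker, 338R), the reverse fusion primer of Table 6.1 can be split into its parts. The sketch below is illustrative only: the function name is hypothetical, and the 338R sequence is an assumption taken as the final 19 bases of the table entry rather than a value stated separately in the text.

```python
# Reverse fusion primer from Table 6.1, written without line breaks.
FUSION_PRIMER = (
    "CCATCTCATCCCTGCGTGTCTCCGACTCAG"  # adaptor part (assumed boundary)
    "ACGAGTGCGT"                      # 10-base MID barcode
    "CA"                              # linker
    "TGCTGCCTCCCGTAGGAGT"             # 338R (assumed: last 19 bases)
)

MID_LENGTH = 10
LINKER = "CA"
PRIMER_338R = "TGCTGCCTCCCGTAGGAGT"  # assumption, see lead-in


def split_fusion_primer(primer, mid_length=MID_LENGTH,
                        linker=LINKER, target=PRIMER_338R):
    """Split a fusion primer into adaptor, MID, linker and target primer."""
    tail = linker + target
    if not primer.endswith(tail):
        raise ValueError("primer does not end with linker + target primer")
    head = primer[:-len(tail)]
    return {
        "adaptor": head[:-mid_length],
        "mid": head[-mid_length:],
        "linker": linker,
        "target_primer": target,
    }


parts = split_fusion_primer(FUSION_PRIMER)
```

Joining the four parts in order reproduces the table entry, which is the only property this check relies on.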

The SILVA ribosomal RNA database was used to analyze the 454 sequencing reads in order to identify CEACAM-enriched bacteria. The replicates of pooled samples yielded 600,000 sequences with an average length of 368 bp, for a total of more than 212 million bases.

16S rRNA library sequences were analysed and sorted by fold-change to reveal significant differences in abundance between the experimental condition and the controls (CEACAM4-enriched bacteria vs. stool sample input and vs. GFP-enriched bacteria). All sequences in the dataset were clustered at 97 % identity using cd-hit (Cluster Database at High Identity with Tolerance). The number of sequences combined in each cluster provides the abundance of the respective fragment type (cluster representative). Taking into account the abundances of fragments across the experimental conditions and replicate experiments (SET1 referring to plate-1 samples and SET2 to plate-2 samples), a standard t-test (using the Statistics::TTest module in Perl, testing for unequal variance) was used to assign a p-value to each change in abundance. A significance threshold of 5 % was applied to differences in abundance between the control and experimental conditions; sequences passing it are referred to as sequences of interest.
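The unequal-variance comparison of cluster abundances can be illustrated with a short sketch. The analysis in this study used the Perl module Statistics::TTest; the Python code below is not that implementation. It only computes the Welch t statistic, the Welch-Satterthwaite degrees of freedom and the fold-change for one cluster. Function names and the example abundances are hypothetical, and converting t and the degrees of freedom into a p-value would additionally require the t-distribution (for example from a statistics library).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for a t-test that does not assume equal variances.

    Each sample needs at least two values, and the two samples must not
    both have zero variance (the statistic is undefined in that case).
    """
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


def fold_change(experimental, control):
    """Ratio of mean abundances (experimental over control)."""
    return mean(experimental) / mean(control)


# Hypothetical abundances of one cluster in the two replicate sets.
enriched = [10, 12]   # SET1, SET2 of the enriched sample
control = [2, 4]      # SET1, SET2 of the control sample
t_value, dof = welch_t(enriched, control)
ratio = fold_change(enriched, control)
```

With only two replicates per condition the degrees of freedom are very small, so the resulting p-values should be read with corresponding caution.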

Using the 16S rRNA sequences present in the SILVA database (V104, Quast et al., 2013), we attempted to identify which bacteria were found to be associated with CEACAM4. First, all SILVA small subunit (SSU)-rRNA sequences were clustered using cd-hit at 97 % identity. The cluster representatives (18 sequences) were combined with our fragment sequences of interest and the combined set was then analysed using cluster analysis of sequences (CLANS).
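The idea behind clustering at a fixed identity threshold can be sketched with a greedy procedure of the kind cd-hit is based on: sequences are processed from longest to shortest, and each one either joins the first cluster whose representative it matches at or above the threshold or founds a new cluster. The code below is a simplified stand-in and not cd-hit itself. In particular, the identity measure (matched characters from Python's difflib divided by the length of the shorter sequence) is an assumption chosen for brevity, and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def identity(seq_a, seq_b):
    """Approximate identity: matched characters / length of shorter sequence.

    Both sequences must be non-empty.
    """
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(seq_a), len(seq_b))


def greedy_cluster(sequences, threshold=0.97):
    """Return a list of (representative, members) pairs."""
    clusters = []
    # Longest sequences first, so that representatives are the longest members.
    for seq in sorted(sequences, key=len, reverse=True):
        for representative, members in clusters:
            if identity(representative, seq) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))
    return clusters


def abundances(clusters):
    """Number of sequences per cluster, keyed by representative."""
    return {rep: len(members) for rep, members in clusters}
```

The size of each cluster is what the text refers to as the abundance of the corresponding cluster representative.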

Afterwards, we checked whether any of our sequences of interest fell outside the bacterial group. After this step, we only used bacterial sequences from the SILVA database. Taking the full set of bacterial sequences from the SILVA database, we then excluded sequences annotated as 'unidentified', 'environmental sample', 'unknown', 'metagenome' or 'uncultured' (363 sequences). Subsequently, the set was clustered using cd-hit at 97 % identity, resulting in a total of 4167 sequences. The combined set of sequences was then clustered using CLANS. CLANS performs an all-against-all BLAST comparison of sequences and represents them as dots in 3D space. Using this 3D CLANS map, all major bacterial families/groupings from the SILVA database could be identified and placed into a taxonomic context. Each sequence is assigned an 'attraction' value to every other sequence based on their reciprocal BLAST hits. The score of the hit was divided by the length of the hit to provide a score-per-column value, which was used as the similarity metric and determined the attraction of sequences to each other.

By equilibrating the graph in 2D or 3D space, sequences similar to each other move into close proximity while sequences of lesser similarity are located more distantly from each other. This leads to a clustered representation of sequence-space, with large clouds of dots representing large groups of sequences with greater than average similarity to each other.
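The score-per-column metric described above can be written down in a few lines. The sketch below is not CLANS code. It assumes that BLAST hits are available as a mapping from (query, subject) to (score, hit length), that only pairs with hits in both directions receive an attraction value, and that the two directions are combined by averaging; the averaging step and all names are illustrative assumptions.

```python
def score_per_column(score, hit_length):
    """BLAST score divided by the length of the hit."""
    return score / hit_length


def pairwise_attraction(hits):
    """Compute one attraction value per reciprocal pair of sequences.

    hits maps (query, subject) -> (score, hit_length).
    Pairs without a hit in both directions are skipped.
    """
    attraction = {}
    for (query, subject), (score, length) in hits.items():
        if query == subject or (subject, query) not in hits:
            continue
        pair = tuple(sorted((query, subject)))
        if pair in attraction:
            continue  # already computed from the reverse direction
        reverse_score, reverse_length = hits[(subject, query)]
        forward = score_per_column(score, length)
        backward = score_per_column(reverse_score, reverse_length)
        # Assumption: average the two directions into one symmetric value.
        attraction[pair] = (forward + backward) / 2
    return attraction
```

In a force-directed layout, pairs with higher values are pulled closer together, which is what produces the clouds of related sequences in the map.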

Pull-down assays of CEACAM-associated bacteria

CEACAM-associated bacteria identified by SILVA analysis were used for binding studies as described before (Kuespert et al., 2007). Briefly, Neisseria grown overnight were taken from a GC agar plate, or anaerobic bacteria grown for 48 hours were taken from their respective growth media, and suspended in PBS. Colony forming units (cfu) were estimated by optical density (OD) readings according to their standard growth curve (Neisseria OD550; anaerobic bacteria OD600). Bacteria were added to cell culture supernatants containing CEACAM N-terminal domains in a total volume of 1 ml. After 1 hour of incubation at 20