
3 Results

3.1 Identification and in silico analysis of CERK1-INTERACTING LysM-RLK-LIKE RLCK1 CLR1

Despite the crucial role of Arabidopsis CERK1 in chitin perception and signalling, little is known about its complex partners and possible downstream targets. To identify putative intracellular interactors and components of the signalling cascade downstream of CERK1, a yeast two-hybrid screen was performed by Hybrigenics (Paris, France). The intracellular domain of CERK1 (amino acids 254-617) was used as bait to screen a prey cDNA library from 1-week-old Arabidopsis seedlings. Among the clones obtained in the screen, a single clone containing a prey fragment of 1167 bp, corresponding to amino acids 83-456 of an uncharacterized protein kinase superfamily protein encoded by At3g57120, was analysed further (Figure 3). According to The Arabidopsis Information Resource (TAIR) genome annotation (Lamesch et al., 2012), At3g57120 is a single-exon gene with a coding region of 1371 bp.
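The lengths quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the 1371 bp coding region includes the stop codon and that all residue ranges are 1-based and inclusive; the helper names are illustrative and not part of any tool used in this work.

```python
# Length bookkeeping for the figures quoted in the text.
# Assumption: the annotated coding region (1371 bp) includes the stop codon.
CODON_LENGTH = 3


def protein_length_from_cds(cds_bp: int) -> int:
    """Number of residues encoded by a CDS whose length includes the stop codon."""
    if cds_bp % CODON_LENGTH != 0:
        raise ValueError("CDS length is not a multiple of three")
    return cds_bp // CODON_LENGTH - 1  # the stop codon encodes no residue


def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based residue range."""
    return last - first + 1


cds_residues = protein_length_from_cds(1371)   # At3g57120 coding region
bait_residues = span_length(254, 617)          # CERK1 intracellular domain
prey_residues = span_length(83, 456)           # region covered by the prey clone
print(cds_residues, bait_residues, prey_residues)
```

Under these assumptions the coding region encodes 456 residues, so a prey fragment ending at amino acid 456 reaches the annotated C-terminus of the protein.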


Figure 3. Genomic sequence of At3g57120 obtained from TAIR10, with the prey fragment retrieved from a single clone in a yeast two-hybrid screen with the CERK1 intracellular domain. The 1371 bp coding sequence of the single-exon gene At3g57120 is shown in black, with the putative start codon marked in red and the stop codon shown in bold black. The 5’ and 3’ untranslated regions are indicated in orange. The sequence of the 1167 bp prey fragment obtained in the yeast two-hybrid assay is underlined in green.

Analysis with the NCBI Conserved Domain Database online tool (NCBI CDD, http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi; Marchler-Bauer et al., 2015) identified the At3g57120 protein as a serine/threonine protein kinase (calculated Expect (E) value of 4.65e-21). A search with the Basic Local Alignment Search Tool (BLAST, blastp NCBI, http://blast.ncbi.nlm.nih.gov/Blast.cgi; Altschul et al., 1997) using the At3g57120 amino acid sequence as query against the Arabidopsis proteome database revealed that the identified protein shares high homology with Arabidopsis LysM-RLKs (data not shown). The four hits with the highest alignment scores comprise three described LysM-RLKs and an as yet uncharacterized putative receptor-like protein. LYK3 (At1g51940) showed the best match to the At3g57120 sequence with an E value of 5e-45, followed by LYK1/CERK1 and LYK5 with E values of 3e-31 and 4e-30, respectively.
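As a reading aid for the E values reported above, the ranking can be reproduced by sorting the hits in ascending order, since a lower E value indicates a more significant match. This is only a sketch using the three values given in the text; the dictionary is illustrative and is not parsed BLAST output.

```python
# E values as reported in the text; a lower value means a more significant match.
e_values = {
    "LYK3": 5e-45,
    "LYK1/CERK1": 3e-31,
    "LYK5": 4e-30,
}

# Sort hit names from most to least significant.
ranked_hits = sorted(e_values, key=e_values.get)
print(ranked_hits)
```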


Figure 4. Amino acid sequence alignment of CLR1 (At3g57120) with the intracellular domains of the five Arabidopsis lysin motif-containing receptor-like kinases (LysM-RLKs). The kinase subdomains I-XI are shown as red boxes. The myristoylation motif of CLR1 was predicted with the PlantsP tool (Podell and Gribskov, 2004; http://plantsp.genomics.purdue.edu/myrist.html) and is indicated by a green box. The putative glycine-rich nuclear localization motif (Cokol et al., 2000) is framed with a blue box. The alignment was generated in Geneious 7.1.7 using the ClustalW algorithm (Kearse et al., 2012). Colouring was performed in Jalview 2.8.2 using the Clustalx settings with a conservation threshold of 30 (Waterhouse et al., 2009).

In particular, the amino acids in the kinase subdomains (I-XI) are highly conserved between At3g57120 and the other LysM-RLKs (Figure 4). Interestingly, an amino acid stretch N-terminal of kinase subdomain I also shares high homology among the depicted proteins (Figure 4). Protein kinase domains contain twelve conserved subdomains, numbered I-XI with subdomain VI divided into VIa and VIb (Hanks and Hunter, 1995). Subdomains I-V are required for ATP binding and thus for the activity of the kinase. Important conserved features involved in ATP binding are the GxGxxG motif (P-loop) in subdomain I, a conserved lysine (K) in subdomain II and a nearly invariant glutamate (E) residue in subdomain III (Hanks and Hunter, 1995). All these motifs and residues are absent from the amino acid sequence of At3g57120, suggesting that it constitutes an inactive kinase. CLR1 also lacks conserved amino acids in subdomain VIb (catalytic loop), subdomain VII (magnesium-binding loop) and subdomain VIII. The activation loop, which spans subdomains VII and VIII and is involved in switching kinase activity on and off (Taylor and Radzio-Andzelm, 1994), contains an insertion of several amino acids in At3g57120. Taken together, these variations in the kinase domain make it very unlikely that the At3g57120 protein has enzymatic activity.
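The presence or absence of the GxGxxG motif (P-loop) in a protein sequence can be checked with a simple pattern search. The sketch below is not the analysis performed here, which relied on the alignment in Figure 4; the function name is hypothetical and the two sequences are invented toy strings, not At3g57120 or any real kinase.

```python
import re

# GxGxxG, where x is any residue. The lookahead lets overlapping matches be reported.
P_LOOP = re.compile(r"(?=G.G..G)")


def find_p_loop(seq: str) -> list[int]:
    """Return the 1-based start positions of all GxGxxG matches in seq."""
    return [match.start() + 1 for match in P_LOOP.finditer(seq.upper())]


# Invented toy sequences for illustration only.
toy_with_motif = "MKLIGSGAFGEVYKA"
toy_without_motif = "MKLIASDAFTEVYKA"

positions_with = find_p_loop(toy_with_motif)
positions_without = find_p_loop(toy_without_motif)
print(positions_with, positions_without)
```

A pattern match of this kind only flags the consensus; whether a hit lies in subdomain I still has to be judged from its position in an alignment with known kinases.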

No transmembrane domain or extracellular domain was predicted for the protein encoded by At3g57120, making it a member of the class of receptor-like cytoplasmic kinases (RLCKs).

Phylogenetic analyses assigned At3g57120 to RLCK subfamily XII (Shiu and Bleecker, 2003). Because it lacks an extracellular domain and shows sequence homology to the kinase domain of LysM-RLKs, we named the protein encoded by At3g57120 CERK1-INTERACTING LysM-RLK-LIKE RLCK1 (CLR1).

In the TAIR10 genome annotation (Lamesch et al., 2012), CLR1 is predicted to be a protein of 456 amino acids. However, analysis with a plant-specific myristoylation prediction tool (PlantsP; Podell and Gribskov, 2004) revealed a putative internal N-myristoylation motif that lies 23 amino acids C-terminal of the annotated N-terminus (Figure 4, green box). Typically, N-myristoylation is a co-translational protein modification in which the N-terminal methionine is removed from the growing peptide and an N-myristoyltransferase (NMT) attaches a myristic acid residue to the now N-terminal glycine at position two (Johnson et al., 1994; Thompson and Okuyama, 2000). However, myristoylation can also occur post-translationally, when a mature protein is enzymatically cleaved to expose a previously internal glycine residue (Zha, 2000; Martin et al., 2011). This raises the question of whether the start codon annotated for CLR1 in TAIR10 is correct or whether the protein in fact starts at the methionine associated with the N-myristoylation motif. For the remainder of this work, the methionine encoded by the originally predicted start codon is named M1 and the methionine at position 23 is named M2 (Figure 4). If CLR1 starts at M1, it is either not myristoylated or is subject to post-translational cleavage after M2 that exposes the internal glycine residue for N-myristoylation. In the alternative scenario, the open reading frame of CLR1 is misannotated and the actual start of the protein is M2.
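The M1/M2 question rests on the requirement that the myristoylated glycine directly follows the initiator methionine. A minimal sketch of how candidate start methionines could be listed is given below. It assumes only this single criterion, whereas PlantsP evaluates further sequence context; the function name is hypothetical and the sequences are invented toy strings, not the CLR1 sequence.

```python
def myristoylation_start_candidates(seq: str) -> list[int]:
    """Return 1-based positions of methionines that are directly followed by glycine.

    A glycine after the methionine is necessary for N-myristoylation but not
    sufficient, so every position returned is only a candidate.
    """
    seq = seq.upper()
    return [
        index + 1
        for index in range(len(seq) - 1)
        if seq[index] == "M" and seq[index + 1] == "G"
    ]


# Invented toy sequences: an internal MG pair and an N-terminal MG pair.
internal_candidates = myristoylation_start_candidates("MSTKAMGCFSS")
terminal_candidates = myristoylation_start_candidates("MGSSK")
print(internal_candidates, terminal_candidates)
```

In the first toy sequence the only candidate is internal, which mirrors the situation described for M2: the glycine becomes N-terminal only if translation starts there or if the protein is cleaved.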


Besides the putative N-terminal myristoylation motif, CLR1 harbours a glycine-rich segment inserted between kinase subdomains IV and V (Figure 4). This stretch of amino acids might constitute an uncommon nuclear localization signal (NLS) with potential DNA-binding ability (Cokol et al., 2000). Glycine-rich sequences near the C-terminus have been reported to mediate nuclear import in some proteins, including the human HETEROGENEOUS NUCLEAR RIBONUCLEOPROTEIN (hnRNP) A1, which is involved in alternative pre-mRNA splicing as well as in regulating telomere length (Siomi and Dreyfuss, 1995; Cokol et al., 2000). In Arabidopsis, the hnRNP homolog RNP1 and the glycine-rich RNA-binding protein AtGRP7 were reported to contain a glycine-rich sequence similar to that of hnRNP A1, which was shown to be important for nuclear import of both proteins (Ziemienowicz et al., 2003). For AtGRP7, a role in alternative pre-mRNA splicing was demonstrated, regulating a feedback loop that negatively controls the circadian rhythm (Heintzen et al., 1994; Heintzen et al., 1997).
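A glycine-rich segment of this kind can be located by scanning a sequence with a sliding window and recording the glycine fraction of each window. The sketch below illustrates the idea only; the window size and threshold are arbitrary assumptions, the function name is hypothetical, and the sequence is an invented toy string rather than CLR1.

```python
def glycine_rich_windows(
    seq: str, window: int = 10, min_fraction: float = 0.5
) -> list[tuple[int, float]]:
    """Return (1-based start, glycine fraction) for each window at or above the threshold."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        fraction = seq[start:start + window].count("G") / window
        if fraction >= min_fraction:
            hits.append((start + 1, fraction))
    return hits


# Invented toy sequence: ten glycines flanked by five alanines on each side.
toy_sequence = "AAAAA" + "G" * 10 + "AAAAA"
rich_windows = glycine_rich_windows(toy_sequence, window=10, min_fraction=0.8)
rich_starts = [start for start, _ in rich_windows]
print(rich_windows)
```

Such a scan identifies compositional bias only; whether a glycine-rich stretch functions as an NLS has to be tested experimentally.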

Based on BLAST analysis (NCBI blastp, http://blast.ncbi.nlm.nih.gov/Blast.cgi; Altschul et al., 1997), the CLR1 protein is encoded by a single-copy gene in Arabidopsis. However, related proteins can be found in other plant species (data not shown). The majority of these proteins are not yet characterized but are predicted to be LysM domain receptor-like kinases of a similar length to the CLR1 sequence. Since, like CLR1, they probably resemble only the kinase domain of LysM proteins, these predicted proteins could represent orthologues of CLR1.