3. Results

3.5. In silico characterisation of et1 and zmzr1 and their comparison

3.5.3. Homology of ET1 and ZMZR1 to other translated plant ESTs

Fig. 3.23: Comparison of the ET1 protein sequences from LC and B73(et1-m3) with ZMZR1 from et1-Ref. The protein sequences were deduced from their respective cDNA sequences, and all the amino acid differences are shown. Differences that conserve the physicochemical property of the residue are shaded yellow. Other differences, which change the physicochemical property or represent gaps, are shaded grey.

sativa (rice), Solanum tuberosum (potato), Triticum aestivum (wheat) and Zea mays (maize). In Arabidopsis, the proteins derived from incomplete ESTs were found to be identical with the one derived through conceptual translation of one of the genomic sequences. Therefore, only the complete conceptually translated protein sequence is depicted here.

The putative proteins depicted in Fig. 3.24 showed high homology to each other. At some positions different amino acids were present which nevertheless belonged to a common physicochemical group, so that the property of the residue at these positions was unchanged and thus conserved. The homology was especially high near the C-terminal end of the proteins, where a stretch of approx. 60 amino acids contained a number of conserved residues.
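The distinction between identical, conserved and variable alignment columns can be illustrated with a minimal sketch. The residue classes in `GROUPS` are an assumption chosen for illustration; the text does not state which physicochemical grouping was used for the figures, and the function names are hypothetical.

```python
# Illustrative residue classes (an assumption; the grouping used for the
# published figures is not specified in the text).
GROUPS = [
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # positively charged
    set("DE"),     # negatively charged
    set("STNQ"),   # polar, uncharged
    set("C"),
    set("G"),
    set("P"),
]


def classify_column(column):
    """Classify one alignment column as 'identical', 'conserved' or 'variable'."""
    residues = set(column)
    if "-" in residues:
        # A gap in any sequence makes the column variable.
        return "variable"
    if len(residues) == 1:
        return "identical"
    if any(residues <= group for group in GROUPS):
        # All residues differ but fall into one physicochemical class.
        return "conserved"
    return "variable"


def classify_alignment(aligned_seqs):
    """Classify every column of a list of equal-length aligned sequences."""
    return [classify_column(col) for col in zip(*aligned_seqs)]


# Short stretch where the ET1 and ZMZR1 rows of Fig. 3.24 differ (V versus I).
print(classify_alignment(["GTVFLQ", "GTIFLQ"]))
```

Under this assumed grouping, a V/I exchange counts as conserved, whereas an A/T exchange would be reported as variable; a different grouping would shift that boundary.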

This was also the region of ET1 showing homology to the zinc ribbon domain of TFIIS. All the sequences were therefore highly conserved in this zinc ribbon-like domain. The ice plant protein was only partly homologous to the conserved domain, probably because of the poor quality of the EST sequence. Similarly, one of the soybean ESTs and the potato EST were incomplete and contained only a part of the zinc ribbon-like domain (data not shown).
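A quick way to check whether a translated EST still contains the cysteine framework of such a domain is a pattern search. The sketch below assumes a C-x2-C ... C-x2-C arrangement with a spacer of 15 to 30 residues; this pattern and the function name are illustrative assumptions, not the criterion used in the analysis. The test fragment is transcribed from the ET1 row of the conserved block in Fig. 3.24.

```python
import re

# Assumed zinc ribbon-like framework: two Cys pairs (C-x2-C) separated by a
# spacer of 15-30 residues. The spacer range is an illustrative choice.
ZINC_RIBBON = re.compile(r"C.{2}C.{15,30}C.{2}C")


def find_zinc_ribbon(protein):
    """Return (start, end) of the first match of the assumed motif, or None."""
    match = ZINC_RIBBON.search(protein)
    return (match.start(), match.end()) if match else None


# Fragment transcribed from the ET1 row of Fig. 3.24 (conserved block).
et1_fragment = (
    "PTEAAIDIKLPRRSLLVQFTCNACGERTKRLINRVAYERGTVFLQCAGCQVYHKFVDNLGLVVEY"
)

print(find_zinc_ribbon(et1_fragment))
```

An incomplete EST that has lost one of the two cysteine pairs would return `None`, which corresponds to the "only a part of the domain" cases mentioned above.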

Taking the approximately 60 aa long conserved zinc ribbon domain as the reference point, the N-terminal and C-terminal ends of the proteins were of variable length. A number of ESTs from rice, maize, barley and tomato seemed to lack a complete N-terminal end. This was concluded from the absence of a complete ORF and of the methionine residue at the start of the protein, and from comparison with other, complete ESTs. Nevertheless, all the incomplete sequences were long enough for analysing the homology among the translated ESTs and the ET1 and ZMZR1 proteins, as they all contained the conserved zinc ribbon-like domain and seemed to contain the complete mature protein. Measured from the conserved zinc ribbon-like domain, all the putative proteins showed variable sequence homology to each other in the N- and C-terminal regions as well as variable sequence length (Fig. 3.24).

[Fig. 3.24, alignment image. Rows, top to bottom: ET1(LC), ET1(B73), ZmESTjuvL, ZMZR1, ZmESTtassl, TaEST, At.putTF, LeESTroot, At.putPr, HvESTsedl, ZmESTmixt, LeESTelcPl, MtEST1mRN, MtESTinfcL. Ruler: alignment positions 1 to 220, marked every 20 columns.]

Fig. 3.24: Comparison of the ET1 and ZMZR1 proteins with translations of ESTs and genomic sequences obtained from online databanks. The names of sequences derived from the cloned ET1 and ZMZR1 cDNAs are written in blue. The other sequences in the alignment have been arranged based on their homology to ET1. The sequences from top to bottom are: ET1 from LC and B73 lines, ZmEST1 (ZmESTjuvL in the alignment): maize EST from juvenile leaf and shoot (cultivar: W64A) (gi:15313257, gb:BI478635.1), ZMZR1 from et1-Ref background, ZmEST2 (ZmESTtassl in the alignment): maize EST from 1-3 mm tassel primordia (cultivar: OH43) (gi:9953030, gb:BE639613.1), TaEST: wheat flag leaf EST (gi:9846808, BE591735.1), At.putTF: Arabidopsis putative transcription factor (gi:6524186, gb:AAF15071.1), LeESTroot: tomato root EST (gi:7333989, gb:AW622342.1), At.putPr: Arabidopsis putative protein (gi:7485299, T01795) and ESTs from mixed tissue (gi:239321, AA585805.1; gi:2413159, AA597736.1), HvESTsedl: barley EST from green seedling leaf (gi:11197727, BF266732), ZmEST3 (ZmESTmixt in the alignment): maize EST from mixed adult tissue (cultivar: W23) (gi:6127241, gb:AW129887.1), LeESTelcPl: tomato EST from leaf inoculated with disease response elicitors (gi:6061943, AW096348), MtEST1mRN: Medicago root nodule EST (gi:10698688, BE998412) and MtESTinfcL: Medicago leaves after inoculation with Colletotrichum trifolii (gi:11608735, BF520052).

Similar residues at a position in the alignment are shaded: the more sequences share the homology, the darker the shading, and residues common to all the sequences are shaded black. The residues in a shaded column are sometimes not identical but belong to a common physicochemical group (GeneDoc: Nicholas et al., 1997). The homology among all the proteins is highest in the zinc ribbon domain region. At a number of positions, different residues with conserved physicochemical properties are observed.

Based on their homologies, two groups of sequences were identified in the alignment. The first eight sequences form the first group and are more homologous to ET1 than to the remaining six sequences. These last six sequences form the second group and are more homologous to each other than to the ET1 group. Despite these small differences, all the sequences are highly homologous and contain a conserved zinc ribbon-like domain. The N-terminal and C-terminal regions flanking the zinc ribbon domain vary in length and in percent homology, both between the two groups and among the sequences within each group.

Since the N-terminal sequences of the ET1 and ZMZR1 proteins were known to be transit sequences for plastid localisation, the proteins obtained from the databanks were also analysed in silico for the presence of plastid-targeting transit sequences. In the TargetP analysis, all these plant proteins, including the partly incomplete ones, were predicted to be plastid localised: the probability of targeting to the plastids was higher than that for any other cellular compartment. For proteins deduced from complete ORF sequences, the certainty of plastid localisation was higher than that obtained for the ET1 and ZMZR1 proteins (Appendix E). According to the analysis, the length of the transit sequences of the proteins with a complete ORF was variable, ranging from 35 to 80 aa residues (data not shown).

Among the homologous sequences obtained from the database, three different ESTs could be identified from Zea mays alone. The translations of two of these, both incomplete ESTs, showed more than 90% homology to the ET1 and ZMZR1 protein sequences (Fig. 3.24). One of these clones, isolated from juvenile leaf and shoot tissue, was clearly more homologous to ET1, whereas the other, isolated from 1-3 mm tassel primordia, was more homologous to the ZMZR1 protein. Their cDNA sequences also showed high homology to the et1 and zmzr1 cDNAs, respectively.

Another EST clone from Zea mays, isolated from mixed adult tissue (tassel, kernel, silk, husk, root, leaf in the ratio 4/2/1/1/1/1), was also found to be homologous to ET1, but was less homologous as compared to the other two Z. mays EST clones.
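Percent figures of this kind can be obtained from a pairwise alignment by counting identical positions. The following is a minimal sketch that computes percent identity over the columns where neither sequence has a gap; the function name and the gap handling are assumptions, and the program and scoring actually used for the values quoted above are not stated in the text. The two fragments are transcribed from the conserved block of the ET1 and ZMZR1 rows in Fig. 3.24, so the result applies to that block only, not to the full-length proteins.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns without a gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Conserved-block fragments transcribed from Fig. 3.24 (65 residues each).
et1_block = "PTEAAIDIKLPRRSLLVQFTCNACGERTKRLINRVAYERGTVFLQCAGCQVYHKFVDNLGLVVEY"
zmzr1_block = "PTEATIDIKLPRRSLLVQFTCNACGERTKRLINRVAYERGTIFLQCAGCQVYHKFVDNLGLVVEY"

print(round(percent_identity(et1_block, zmzr1_block), 1))
```

The two fragments differ at two of 65 positions (A/T and V/I), so the identity within this block is 63/65.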

Analysis of ESTs from other plant sources, like Arabidopsis, barley, Medicago, soybean and tomato also indicated the presence of two different ESTs from each plant. One of these two ESTs was more homologous to ET1 and ZMZR1, and the other more homologous to the third EST from Z. mays, so that they could be grouped into two different groups. From rice (immature leaf and apical meristem, gi:3763200, dbj:AU029952.1, not depicted) and wheat only one EST each was obtained, which was more homologous to the ET1 group of proteins. However, a few EST clones from other plant sources not depicted here, like ice plant (6 week old, gi:4464843, AI547355), potato (leaves, gi:13610149, BG592009.1), tomato seeds (gi:5894937, AW036095) and tomato red ripe fruits (without seeds and locules, gi:6976620, AW441369), were more homologous to the third EST clone from Z. mays (Fig. 3.24).

Apart from these ESTs, one EST from potato (axillary buds representing developing stolons, gi:9249551, BE340020.1; gi:9250329, BE340798.1) was found to differ slightly from both groups of proteins in the zinc ribbon-like domain.

Therefore, based on the homology observed in the alignment in Fig. 3.24, the sequences could be classified into two main groups: the first contained ESTs more homologous to the ET1 protein, and the second contained ESTs more homologous to the third maize EST. The differences between the two groups extended throughout the protein but were clearer in the mature protein region. The transit sequences were less conserved among the proteins and showed more variability in sequence length as well as in amino acid consensus. Despite this, the transit sequences within each group were more homologous to each other than to those of the other group.
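The two-group classification can be sketched as a nearest-reference assignment: each sequence is placed in the group whose reference it resembles more. This is an illustration only, not the procedure used for the alignment. The function names are hypothetical, the fragments are transcribed from the conserved block of Fig. 3.24, and their assignment to rows is assumed from the order of the row labels.

```python
def identity(frag_a, frag_b):
    """Fraction of identical residues between two equal-length, gap-free fragments."""
    if len(frag_a) != len(frag_b):
        raise ValueError("fragments must be the same length")
    return sum(x == y for x, y in zip(frag_a, frag_b)) / len(frag_a)


# Group references: ET1 and the third maize EST (ZmESTmixt), conserved block only.
REFERENCES = {
    "ET1 group": "PTEAAIDIKLPRRSLLVQFTCNACGERTKRLINRVAYERGTVFLQCAGCQVYHKFVDNLGLVVEY",
    "second group": "TLPWSLFTKSPRRRMRVAFTCNVCGQRTTRAINPHAYTDGTVFVQCCGCNVFHKLVDNLNLFHEM",
}


def assign_group(fragment):
    """Return the name of the reference with the highest identity to the fragment."""
    return max(REFERENCES, key=lambda name: identity(fragment, REFERENCES[name]))


# Query fragments (row assignment assumed): wheat and barley ESTs.
ta_est = "PAEASFDIKLPRRSLLVQFTCTKCDARTERLINRVAYERGTVFLQCAGCQVYHKFVDNLGLIVEY"
hv_est = "TLPWSLFTKSPRRRMRVAFTCNVCGQRTTRAINPHAYTDGTVFVQCCGCSIFHKLVDNLNLFHEM"

for label, frag in [("TaEST", ta_est), ("HvESTsedl", hv_est)]:
    print(label, "->", assign_group(frag))
```

A comparison restricted to the conserved block ignores the transit sequences, which the text describes as more variable, so it reflects the grouping only in the mature protein region.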

Based on the homology of ZMZR1 to ET1 and the protein structure analysis carried out with ET1 (see section 3.3.3), ZMZR1 was also found to contain a secondary structure like that of ET1 and was also homologous to the TFIIS/Rpb9 zinc ribbon domain.