

3.4.3 Characterisation of Ph kni1

Comparison of all recovered Ph kni1 sequences identified 124 sites of nucleotide exchange within the complete Ph kni1 cDNA, corresponding to 5.6% nucleotide variation. Of these, 72 occur in only a single sequence and are therefore considered sporadic (5.3.3.2). They are found predominantly in the ORF (51), where they account for 28 amino acid changes (missense alterations) and one premature stop (nonsense alteration). The other 22 sporadic nucleotide exchanges within the ORF do not alter the translated amino acid sequence (silent exchanges). This distribution is consistent with the expected randomness of artificial nucleotide exchanges, suggesting that the sporadic exchanges are artefacts.

The remaining 52 nucleotide exchanges are found in more than one independent sequence and are therefore considered polymorphic. They occur predominantly in the 5' UTR (42). Of the six polymorphic nucleotide exchanges present in the ORF, five do not alter the translated amino acid sequence. The sixth results in either valine or phenylalanine at amino acid position 212, which lies outside the described Kni protein domains (ZNFD, KNID, PIDLS motif).

These findings strongly suggest that the polymorphic nucleotide exchanges represent naturally occurring polymorphisms of wild-type Ph kni1 alleles (5.3.3.2).
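The distinction between sporadic and polymorphic nucleotide exchanges can be sketched in a few lines of Python. This is only an illustration of the counting rule stated above (an exchange seen in a single clone is sporadic, one seen in two or more independent clones is polymorphic); the function name, the input layout and the toy positions and clone names are assumptions and do not come from this work. The second part merely re-checks the totals reported in the text against the transcript length of 2228 bases given in Figure 67.

```python
def classify_exchanges(observations):
    """Split exchange sites into sporadic and polymorphic ones.

    observations: iterable of (position, clone_id) pairs, one pair for each
    clone in which an exchange at that position was observed.
    (Hypothetical input layout, chosen for illustration only.)
    """
    clones_per_site = {}
    for position, clone_id in observations:
        clones_per_site.setdefault(position, set()).add(clone_id)
    # Seen in exactly one independent clone: sporadic.
    sporadic = sorted(p for p, c in clones_per_site.items() if len(c) == 1)
    # Seen in two or more independent clones: polymorphic.
    polymorphic = sorted(p for p, c in clones_per_site.items() if len(c) > 1)
    return sporadic, polymorphic


# Toy data: positions and clone names are invented, not taken from the thesis.
toy_observations = [
    (10, "cloneA"), (10, "cloneB"),
    (57, "cloneA"),
    (300, "cloneC"), (300, "cloneA"),
    (812, "cloneB"),
]
toy_sporadic, toy_polymorphic = classify_exchanges(toy_observations)

# Consistency check of the counts reported in the text.
TRANSCRIPT_LENGTH = 2228      # bases, from the Figure 67 caption
TOTAL_SITES = 124
SPORADIC_SITES = 72
POLYMORPHIC_SITES = 52
SPORADIC_IN_ORF = {"missense": 28, "nonsense": 1, "silent": 22}

variation_percent = round(100 * TOTAL_SITES / TRANSCRIPT_LENGTH, 1)
```

With real data, each observation would come from comparing one clone sequence against the consensus; how that comparison is made is outside the scope of this sketch.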

There is no indication of differentially spliced Ph kni1 transcripts or other types of Ph kni1 transcript isoforms, since the comparison of all recovered Ph kni1 sequences supports a single consistent Ph kni1 transcript with only minor variations in its 5' and 3' extension.

Based on these findings, the complete sequence of the clone Ph_kni1_cDNAJZB5 was chosen as the Ph kni1 reference sequence (Ph kni1_ref). It covers all portions of the cDNA sequence that are shared by at least two independent Ph kni1 cDNA fragment clones. Except for two sporadic nucleotide exchanges within the 3' UTR, it carries only putatively allelic polymorphisms and therefore translates into the consensus Parhyale Kni1 protein sequence.

As an existing cDNA clone sequence, it is preferred over an in silico modified consensus sequence and, for this reason, not altered further. Ph kni1_ref has been used for phylogenetic studies.

The Ph kni1 transcript is 2.2 kb in length and encodes a protein of 397 amino acids. The Parhyale Kni1 protein has a typical C4-type N-terminal zinc finger domain (ZNFD, 66 amino acids) that most closely resembles the Knrl ZNFD. Adjacent to the ZNFD, Parhyale Kni1 has a knirps box domain (KNID, 19 amino acids) with high sequence similarity to the KNID of described kni family proteins of other species. C-terminally, a conserved PIDLS motif is found within the hydrophobic region of the Parhyale Kni1 protein. This is typical for Kni proteins of arthropods (Figures 67, 68; 3.4.6).

Figure 67: Schematic view of the Ph kni1 transcript (5' to 3'). The length of the Ph kni1 transcript is 2228 bases. Shown are: 5' UTR in grey (length 698 bases, nucleotide positions 1-698), ORF in black (length 1194 bases, encoding 397 amino acids; nucleotide positions 699-1892) and the 3' UTR in grey (length 336 bases, nucleotide positions 1893-2228). The ORF region encoding the ZNFD is depicted in blue (length 198 bases, encoding 66 amino acids; nucleotide positions 729-926). The ORF region encoding the KNID is depicted in orange (length 57 bases, encoding 19 amino acids; nucleotide positions 945-1001). The ORF region encoding a conserved PIDLS motif is depicted in pink (length 36 bases, encoding 12 amino acids; nucleotide positions 1716-1751).
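The coordinates given in the Figure 67 caption can be cross-checked arithmetically. The following Python sketch is an illustrative consistency check only: the function and variable names are assumptions, and the check assumes 1-based inclusive nucleotide positions and that the stated ORF length includes the stop codon. It verifies that the three regions tile the transcript without gaps, that the ORF length matches the protein length, and that each domain-encoding region is in frame and has three bases per encoded residue.

```python
# Coordinates (1-based, inclusive) as stated in the Figure 67 caption.
TRANSCRIPT_LENGTH = 2228
PROTEIN_LENGTH = 397
REGIONS = {"5' UTR": (1, 698), "ORF": (699, 1892), "3' UTR": (1893, 2228)}
# name: (start, end, number of encoded amino acids)
DOMAINS = {
    "ZNFD": (729, 926, 66),
    "KNID": (945, 1001, 19),
    "PIDLS motif": (1716, 1751, 12),
}


def span_length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def check_layout():
    """Return a list of inconsistencies; an empty list means none were found."""
    problems = []
    expected_start = 1
    for name, (start, end) in REGIONS.items():
        if start != expected_start:
            problems.append(f"{name} does not start at {expected_start}")
        expected_start = end + 1
    if expected_start - 1 != TRANSCRIPT_LENGTH:
        problems.append("regions do not cover the full transcript")

    orf_start, orf_end = REGIONS["ORF"]
    codons, remainder = divmod(span_length(orf_start, orf_end), 3)
    # Assumption: the ORF length includes the stop codon.
    if remainder != 0 or codons - 1 != PROTEIN_LENGTH:
        problems.append("ORF length does not match protein length")

    for name, (start, end, residues) in DOMAINS.items():
        if (start - orf_start) % 3 != 0:
            problems.append(f"{name} is not in frame with the ORF")
        if span_length(start, end) != 3 * residues:
            problems.append(f"{name} length does not match residue count")
    return problems


def residue_range(name):
    """Amino acid positions (1-based) covered by a domain-encoding region."""
    orf_start = REGIONS["ORF"][0]
    start, end, _ = DOMAINS[name]
    return (start - orf_start) // 3 + 1, (end - orf_start + 1) // 3


layout_problems = check_layout()
```

The helper `residue_range` converts nucleotide coordinates into amino acid positions, which is convenient when asking whether a given residue falls inside one of the annotated domains.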

0001 AGTTGTGCTCGTGCTGGCCGAAAGAGTGCATACTCACGCGTCCGAGCTGTGGTTTGCACTTCCTGGCA

Figure 68: Ph kni1 cDNA and derived amino acid sequence. The sequence is in FASTA format and represents the Ph kni1 cDNA, derived from the mRNA transcript. 5' and 3' UTR are shown in grey, ORF in black. The translated amino acid sequence is printed bold and above the corresponding nucleotide sequence. Individual amino acids are above the central nucleotide of the respective codon. The putative start and stop codons are shown in green and red, respectively. The ZNFD is shown in blue, the KNID in orange and a conserved motif covering the amino acids PIDLS in pink. The cysteine residues within the ZNFD are highlighted in blue. The nucleotide sequences that encode these domains are shown in the respective colours. Numbers to the left give the relative nucleotide and amino acid sequence positions and share the font parameters of the corresponding sequence. The ends of the amino acid and the nucleotide sequences are indicated by numbers to the right of the corresponding line.

3.4.4 Isolation of Ph kni2

Two-step 5' RACE yielded five independent Ph kni2 5' cDNA fragment clones. They all share an identical transcription start. Any two of their sequences differ by up to 2%, which is slightly above the average for RACE sequences recovered in this work. Still, these findings suggest that all sequences derive from transcripts of the same gene.

The Ph kni2 3' cDNA sequence was fully recovered by two-step 3' RACE. Five independent Ph kni2 3' cDNA fragment clones were obtained. However, the complete sequence was retrieved only from clone Ph_kni2_3R21. Minor sections of the other clones remained of low quality despite repeated sequencing efforts. Since every part of the 3' cDNA sequence is covered by at least two independent Ph kni2 3' cDNA fragment clones, the sequence information was considered sufficient for assembling the complete Ph kni2 cDNA sequence in silico. Within the portion of sequence they share, any two 3' RACE sequences differ at less than 1% of nucleotide positions and are therefore considered identical. There are minor variations in poly(A) tailing.

The sequence information provided by all recovered 5' and 3' Ph kni2 RACE clones was sufficient for assembling the complete Ph kni2 cDNA sequence in silico. To verify the consistency of the assembled Ph kni2 cDNA sequence, contiguous Ph kni2 cDNA and ORF fragments were isolated via long-distance PCR (5.2.3.4), performed on Parhyale cDNA collections (5.2.2). Two independent Ph kni2 cDNA sequences and three independent Ph kni2 ORF sequences were obtained (A2.12.1). They are consistent with the findings from 5' and 3' RACE (A2.12.2). Because the nucleotide variation between any two ORF sequences or between any two cDNA sequences does not exceed 1%, they are considered identical (5.3.3.3, A3, table A1).
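The pairwise comparison underlying the 1% criterion can be illustrated as follows. This Python sketch is not the procedure used in this work, which is described in 5.3.3.3; it assumes that the sequences have already been aligned to equal length, that positions missing from a clone are marked with a gap character, and that variation is counted only over the positions both sequences share. All names, the toy sequences and the threshold constant are hypothetical.

```python
from itertools import combinations

IDENTITY_THRESHOLD_PERCENT = 1.0  # assumption: "does not exceed 1%"


def percent_variation(seq_a, seq_b, gap="-"):
    """Percentage of differing positions between two pre-aligned sequences,
    counted only over columns where both sequences have a nucleotide."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    shared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not shared:
        raise ValueError("sequences share no aligned positions")
    mismatches = sum(a != b for a, b in shared)
    return 100.0 * mismatches / len(shared)


def max_pairwise_variation(aligned):
    """Largest pairwise variation among a dict of name -> aligned sequence."""
    return max(
        percent_variation(a, b) for a, b in combinations(aligned.values(), 2)
    )


# Toy alignment (invented sequences, 200 columns each).
reference = "ACGT" * 50
toy_alignment = {
    "clone1": reference,
    "clone2": "T" + reference[1:],          # one exchange at the first position
    "clone3": "-" * 10 + reference[10:],    # shorter 5' extension
}
toy_max = max_pairwise_variation(toy_alignment)
considered_identical = toy_max <= IDENTITY_THRESHOLD_PERCENT
```

Restricting the count to shared columns mirrors the statement that sequences are compared within the portion of sequence they share, so that differences in 5' or 3' extension do not count as nucleotide variation.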