Literature – DNA Computing
T. Head, Formal language theory and DNA: An analysis of the generative capacity of specific recombinant behaviors. Bull. Math. Biology 49 (1987) 737–759.
L. M. Adleman, Molecular computation of solutions to combinatorial problems. Science 266 (1994) 1021–1024.
T. Head, Gh. Păun and D. Pixton, Language theory and molecular genetics. In: G. Rozenberg and A. Salomaa (eds.), Handbook of Formal Languages, Springer-Verlag, 1997, Vol. II, Chapter 7, 295–360.
Gh. Păun, G. Rozenberg and A. Salomaa, DNA Computing – New Computing Paradigms. Springer-Verlag, Berlin, 1998.
Molecule with Thymine Base
[Figure: structural formula of a nucleotide with a thymine base, attached to a sugar whose carbon atoms are numbered 1’ to 5’, with phosphate groups (P) linking neighbouring sugars.]
Double Stranded DNA Molecule
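[Figure: double-stranded DNA molecule. One strand runs 5’ → 3’, the other 3’ → 5’. The base pairs A–T, C–G and G–C are joined by hydrogen bonds, and the sugars (carbons 1’ to 5’) of each strand are linked by phosphate groups (P).]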
Measuring the Length of DNA Molecules by Gel Electrophoresis
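[Figure: gel between negative electrodes (–) and positive electrodes (+). The DNA fragments move from – towards +; large fragments stay near the negative electrodes, small fragments travel further towards the positive electrodes.]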
Polymerase
5’ CGGA 3’
3’ GCCTCTACCT 5’
→
5’ CGGAG 3’
3’ GCCTCTACCT 5’
→
5’ CGGAGA 3’
3’ GCCTCTACCT 5’
→ ... →
5’ CGGAGATGGA 3’
3’ GCCTCTACCT 5’
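The extension steps shown above can be replayed in a few lines of Python. This is a minimal sketch, not a model of the enzyme: the function name `extend_primer` is hypothetical, the primer is read 5’ → 3’ and the template 3’ → 5’, so that position i of one string pairs with position i of the other.

```python
# Watson-Crick pairing.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_primer(primer: str, template: str) -> list[str]:
    """Return the successive states of the growing strand.

    primer   -- upper strand, read 5'->3'
    template -- lower strand, read 3'->5' (aligned position by position)
    """
    # The primer must already pair with the start of the template.
    for p, t in zip(primer, template):
        if COMPLEMENT[t] != p:
            raise ValueError("primer does not pair with the template")
    states = [primer]
    strand = primer
    # One complementary nucleotide is appended at the 3' end per step.
    for t in template[len(primer):]:
        strand += COMPLEMENT[t]
        states.append(strand)
    return states


steps = extend_primer("CGGA", "GCCTCTACCT")
print(steps[1])   # CGGAG
print(steps[-1])  # CGGAGATGGA
```

The list `steps` contains every intermediate strand, so its first three entries correspond to the first three pictures of the slide.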
Polymerase Chain Reaction
[Figure: one cycle of the polymerase chain reaction on a double strand delimited by the segments β and γ. Step 1, denaturation by heating: the two strands separate. Step 2, annealing: the β-primer and the γ-primer bind to the single strands. Step 3, polymerase: each primer is extended, which yields two copies of the double strand between β and γ.]
Endonuclease
5’ CATATG 3’
3’ GTATAC 5’
↓ NdeI
5’ CA 3’        5’ TATG 3’
3’ GTAT 5’        3’ AC 5’

5’ GGCC 3’
3’ CCGG 5’
↓ HaeIII
5’ GG 3’    5’ CC 3’
3’ CC 5’    3’ GG 5’
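The two cuts above can be reproduced with a small sketch. The table `ENZYMES` and the function `cut` are hypothetical names; the cut offsets are simply read off the two diagrams (counted from the left end of the recognition site, separately for the upper and the lower strand), and only the first occurrence of a site is cut.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# name -> (recognition site on the upper strand,
#          cut offset in the upper strand, cut offset in the lower strand)
ENZYMES = {
    "NdeI": ("CATATG", 2, 4),   # staggered cut, leaves sticky ends
    "HaeIII": ("GGCC", 2, 2),   # straight cut, leaves blunt ends
}


def lower_strand(upper: str) -> str:
    """Lower strand written 3'->5', aligned with the upper strand."""
    return "".join(COMPLEMENT[b] for b in upper)


def cut(upper: str, enzyme: str):
    """Cut at the first site; return ((upper, lower) left, (upper, lower) right) or None."""
    site, up_off, low_off = ENZYMES[enzyme]
    pos = upper.find(site)
    if pos < 0:
        return None  # the enzyme finds no recognition site
    lower = lower_strand(upper)
    left = (upper[:pos + up_off], lower[:pos + low_off])
    right = (upper[pos + up_off:], lower[pos + low_off:])
    return left, right


print(cut("CATATG", "NdeI"))   # (('CA', 'GTAT'), ('TATG', 'AC'))
print(cut("GGCC", "HaeIII"))   # (('GG', 'CC'), ('CC', 'GG'))
```

A fragment has sticky ends exactly when its upper and lower parts differ in length, as in the NdeI case.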
Hydrogen Bonding and DNA Ligase
C-A        T-A-T-G
| |            | |
G-T-A-T        A-C
↓ hydrogen bonding
C-A T-A-T-G
| | | | | |
G-T-A-T A-C
↓ ligase
C-A-T-A-T-G
| | | | | |
G-T-A-T-A-C
Splicing with Sticky Ends
Here α1, β1, α2, β2 denote the double-stranded parts to the left and right of the recognition sites.
α1 T C G A β1        α2 G C G C β2
   A G C T              C G C G
↓ TaqI               ↓ SciNI
α1 T     C G A β1    α2 G     C G C β2
   A G C     T          C G C     G
exchange of the right-hand fragments
α1 T     C G C β2    α2 G     C G A β1
   A G C     G          C G C     T
↓ hydrogen bonding and DNA ligase
α1 T C G C β2        α2 G C G A β1
   A G C G              C G C T
Splicing with Blunt Ends
Here α1, β1, α2, β2 denote the double-stranded parts to the left and right of the recognition sites.
α1 A G C T β1        α2 G G C C β2
   T C G A              C C G G
↓ AluI               ↓ HaeIII
α1 A G   C T β1      α2 G G   C C β2
   T C   G A            C C   G G
exchange of the right-hand fragments
α1 A G   C C β2      α2 G G   C T β1
   T C   G G            C C   G A
↓ hydrogen bonding and DNA ligase
α1 A G C C β2        α2 G G C T β1
   T C G G              C C G A
Adleman’s Experiment
[Figure: graph with the seven vertices 0 to 6.]
V(2) TATCGGATCGGTATATCCGA
E(2,3) CATATAGGCTCGATAAGCTC
V(3) GCTATTCGAGCTTAAAGCTA
E(3,4) GAATTTCGATCCGATCCATG
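The four sequences above fit a simple reading, stated here as an assumption about the slide's encoding rather than as its definition: E(i,j) is the base-wise Watson-Crick complement of the second half of V(i) followed by the first half of V(j). The sketch below checks this against the given data; `edge_strand` is a hypothetical name. V(4) is not listed, so only the first half of E(3,4) can be checked.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-wise complement, written in the same left-to-right order."""
    return "".join(COMPLEMENT[b] for b in strand)


def edge_strand(v_from: str, v_to: str) -> str:
    """Assumed encoding: complement of (second half of v_from + first half of v_to)."""
    return complement(v_from[len(v_from) // 2:] + v_to[:len(v_to) // 2])


V2 = "TATCGGATCGGTATATCCGA"
V3 = "GCTATTCGAGCTTAAAGCTA"
E23 = "CATATAGGCTCGATAAGCTC"
E34 = "GAATTTCGATCCGATCCATG"

print(edge_strand(V2, V3) == E23)       # True
print(complement(V3[10:]) == E34[:10])  # True
```

Under this reading an edge strand pairs with the end of one vertex strand and the start of the next, so it can hold the two vertex strands together until they are joined.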
Splicing Scheme and Splicing Operation I
Definition:
A splicing scheme is a pair (V, R), where
– V is an alphabet and
– R is a subset of V*#V*$V*#V*.
The elements of R are called splicing rules.
Definition:
We say that w ∈ V* and z ∈ V* are obtained from u ∈ V* and v ∈ V* by the splicing rule r = r_1#r_2$r_3#r_4, and write (u, v) ⊢_r w and (u, v) ⊢_r z, if the following conditions are satisfied:
– u = u_1 r_1 r_2 u_2 and v = v_1 r_3 r_4 v_2,
– w = u_1 r_1 r_4 v_2 and z = v_1 r_3 r_2 u_2.
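The definition can be turned into a short function that enumerates every way of applying one rule to a pair of words. This is a sketch under two assumptions: a rule is written as the string `r1#r2$r3#r4`, and the symbols `#` and `$` do not belong to V. The name `splice` is hypothetical.

```python
def splice(u: str, v: str, rule: str) -> set[tuple[str, str]]:
    """All pairs (w, z) obtained from (u, v) by the rule r1#r2$r3#r4."""
    left, right = rule.split("$")
    r1, r2 = left.split("#")
    r3, r4 = right.split("#")
    results = set()
    for i in range(len(u) + 1):
        # u = u1 r1 r2 u2 with u1 = u[:i]
        if not u.startswith(r1 + r2, i):
            continue
        u1, u2 = u[:i], u[i + len(r1) + len(r2):]
        for j in range(len(v) + 1):
            # v = v1 r3 r4 v2 with v1 = v[:j]
            if not v.startswith(r3 + r4, j):
                continue
            v1, v2 = v[:j], v[j + len(r3) + len(r4):]
            results.add((u1 + r1 + r4 + v2, v1 + r3 + r2 + u2))
    return results


print(splice("aabb", "ab", "a#b$a#b"))  # {('aab', 'abb')}
```

Here u = aabb is cut between its second a and first b, v = ab between a and b, and the right-hand parts are exchanged. Several pairs are returned when the sites r_1 r_2 or r_3 r_4 occur more than once.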
Splicing Scheme and Splicing Operation II
For a language L over V and a splicing scheme (V, R) we set
spl(L, R) = {w | (u, v) ⊢_r w, u ∈ L, v ∈ L, r ∈ R}.
For two language families ℒ_1 and ℒ_2 we set
spl(ℒ_1, ℒ_2) = {L' | L' = spl(L, R) for some L ∈ ℒ_1 and some splicing scheme (V, R) with R ∈ ℒ_2}.
Splicing Operation – Examples
L = {a^n b^n | n ≥ 0} and R = {a#b$a#b}:
spl(L, R) = {a^n b^m | n ≥ 1, m ≥ 1}
L ⊂ V* arbitrary, L' ⊂ V* arbitrary, splicing scheme (V ∪ {c}, R) with R = {#xc$c# | x ∈ L'}:
spl(L{c}, R) = {w | wz ∈ L for some z ∈ L'}
{a^n b^n} ∉ spl(ℒ(REG), ℒ(RE))
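The first two examples can be checked on finite samples. The sketch below computes spl(L, R) for finite L and R only, so it illustrates the examples and proves nothing about the infinite languages; the names `parse_rule`, `occurrences` and `spl` are hypothetical, and rules are written as strings `r1#r2$r3#r4` with `#` and `$` outside V.

```python
from itertools import product


def parse_rule(rule: str) -> tuple[str, str, str, str]:
    """Split 'r1#r2$r3#r4' into its four parts."""
    left, right = rule.split("$")
    r1, r2 = left.split("#")
    r3, r4 = right.split("#")
    return r1, r2, r3, r4


def occurrences(word: str, site: str) -> list[int]:
    """All start positions at which site occurs in word."""
    return [i for i in range(len(word) + 1) if word.startswith(site, i)]


def spl(language: set[str], rules: set[str]) -> set[str]:
    """spl(L, R) for finite L and R: all w with (u, v) |-_r w."""
    result = set()
    for rule in rules:
        r1, r2, r3, r4 = parse_rule(rule)
        for u, v in product(language, repeat=2):
            for i in occurrences(u, r1 + r2):
                for j in occurrences(v, r3 + r4):
                    v2 = v[j + len(r3) + len(r4):]
                    result.add(u[:i] + r1 + r4 + v2)
    return result


# First example, restricted to n = 0..3.
sample = {"a" * n + "b" * n for n in range(4)}
expected = {"a" * n + "b" * m for n in range(1, 4) for m in range(1, 4)}
print(spl(sample, {"a#b$a#b"}) == expected)  # True

# Second example with L = {abab, ba} and L' = {ab}.
quotient = spl({w + "c" for w in {"abab", "ba"}}, {"#abc$c#"})
print(quotient)  # {'ab'}
```

In the second example the only word w with w·ab ∈ L is ab, which is what the splicing step returns.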
Generative Power of the Splicing Operation
Theorem:
The following table holds, where at the intersection of the row marked by X and the column marked by Y we give Z if ℒ(Z) = spl(ℒ(X), ℒ(Y)), and Z_1/Z_2 if ℒ(Z_1) ⊂ spl(ℒ(X), ℒ(Y)) ⊂ ℒ(Z_2).
       FIN    REG    CF        CS        RE
FIN    FIN    FIN    FIN       FIN       FIN
REG    REG    REG    REG/CF    REG/RE    REG/RE
CF     CF     CF     RE        RE        RE
CS     RE     RE     RE        RE        RE
RE     RE     RE     RE        RE        RE
Some Lemmas I
Lemma:
For any language families ℒ_1, ℒ_2, ℒ'_1, ℒ'_2 with ℒ_1 ⊆ ℒ'_1 and ℒ_2 ⊆ ℒ'_2, we have spl(ℒ_1, ℒ_2) ⊆ spl(ℒ'_1, ℒ'_2).
Lemma:
If ℒ_1 is closed under concatenation with symbols, then ℒ_1 ⊆ spl(ℒ_1, ℒ_2) for all language families ℒ_2.
Lemma:
If ℒ is closed under concatenation, homomorphisms, inverse homomorphisms and intersections with regular sets, then spl(ℒ, ℒ(REG)) ⊆ ℒ.
Some Lemmas II
Lemma:
If ℒ is closed under homomorphisms, inverse homomorphisms and intersections with regular sets, then spl(ℒ(REG), ℒ) ⊆ ℒ.
Lemma:
For any recursively enumerable language L, there are context-free languages L_1 and L_2 such that L = {u | uv ∈ L_1 for some v ∈ L_2}.
Lemma:
For any recursively enumerable language L ⊂ V*, there are a context-sensitive language L' and letters c_1 and c_2, which are not in V, such that L' ⊆ L{c_1}{c_2}* holds, and for any w ∈ L there is a number i ≥ 1 such that w c_1 c_2^i ∈ L'.
Splicing Systems
Definition:
A splicing system is a triple G = (V, R, A), where
– V is an alphabet,
– R is a subset of V*#V*$V*#V*, and
– A is a subset of V*.
Definition:
The language L(G) generated by a splicing system G is defined as follows:
– spl_0(G) = A and spl_{i+1}(G) = spl(spl_i(G), R) ∪ spl_i(G) for i ≥ 0,
– L(G) = ⋃_{i≥0} spl_i(G).
Example:
G = ({a, b}, {a#b$a#b}, {(a^n b^n)^m | n ≥ 1, m ≥ 1})
L(G) = {a^{r_1} b^{s_1} a^{r_2} b^{s_2} ... a^{r_m} b^{s_m} | m ≥ 1, r_i ≥ 1, s_i ≥ 1, 1 ≤ i ≤ m}
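The iteration spl_0, spl_1, ... can be simulated for a finite set of axioms if the word length is bounded. The sketch below makes that assumption explicit: words longer than `max_len` are discarded, so the result is only a subset of L(G). The axioms {ab, aabb, abab} are a finite sample of the infinite axiom set of the example, and all function names are hypothetical.

```python
import re
from itertools import product


def spl(language: set[str], rules: set[str]) -> set[str]:
    """One splicing step on a finite language (rules as 'r1#r2$r3#r4')."""
    result = set()
    for rule in rules:
        left, right = rule.split("$")
        r1, r2 = left.split("#")
        r3, r4 = right.split("#")
        for u, v in product(language, repeat=2):
            for i in range(len(u) + 1):
                if not u.startswith(r1 + r2, i):
                    continue
                for j in range(len(v) + 1):
                    if v.startswith(r3 + r4, j):
                        result.add(u[:i] + r1 + r4 + v[j + len(r3) + len(r4):])
    return result


def iterate(axioms: set[str], rules: set[str], max_len: int) -> set[str]:
    """Apply spl repeatedly until no new word of length <= max_len appears."""
    current = set(axioms)
    while True:
        step = {w for w in spl(current, rules) if len(w) <= max_len}
        if step <= current:
            return current
        current |= step


axioms = {"ab", "aabb", "abab"}
generated = iterate(axioms, {"a#b$a#b"}, max_len=6)
print("aab" in generated, "abb" in generated)  # True True
# Every generated word has the shape a^{r_1} b^{s_1} ... a^{r_m} b^{s_m}.
print(all(re.fullmatch(r"(a+b+)+", w) for w in generated))  # True
```

The loop terminates because only finitely many words of length at most `max_len` exist over a finite alphabet.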
Extended Splicing Systems
Definition:
i) An extended splicing system is a quadruple G = (V, T, R, A), where
– H = (V, R, A) is a splicing system and
– T is a subset of V.
ii) The language generated by an extended splicing system G is defined as L(G) = L(H) ∩ T*.
Example:
G = ({a, b, c}, {a, b}, {#c$c#a}, {c^m a^n b^n | n ≥ 1})
L(G) = {a^n b^n | n ≥ 1}
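The role of the terminal alphabet T can be seen on a finite sample of this example. The sketch below takes the two axioms cab and caabb (an assumed sample with one leading c and n = 1, 2), iterates the rule #c$c#a until nothing new appears, and then keeps only the words over T = {a, b}. The function names are hypothetical, and the loop is only guaranteed to stop for inputs like this one, where splicing produces finitely many words.

```python
from itertools import product


def spl(language: set[str], rules: set[str]) -> set[str]:
    """One splicing step on a finite language (rules as 'r1#r2$r3#r4')."""
    result = set()
    for rule in rules:
        left, right = rule.split("$")
        r1, r2 = left.split("#")
        r3, r4 = right.split("#")
        for u, v in product(language, repeat=2):
            for i in range(len(u) + 1):
                if not u.startswith(r1 + r2, i):
                    continue
                for j in range(len(v) + 1):
                    if v.startswith(r3 + r4, j):
                        result.add(u[:i] + r1 + r4 + v[j + len(r3) + len(r4):])
    return result


def closure(axioms: set[str], rules: set[str]) -> set[str]:
    """Union of spl_0, spl_1, ... (assumes the union is finite)."""
    current = set(axioms)
    while True:
        step = spl(current, rules)
        if step <= current:
            return current
        current |= step


T = {"a", "b"}
all_words = closure({"cab", "caabb"}, {"#c$c#a"})
terminal_words = {w for w in all_words if set(w) <= T}
print(sorted(all_words))       # ['aabb', 'ab', 'caabb', 'cab']
print(sorted(terminal_words))  # ['aabb', 'ab']
```

The rule removes the leading c, and the intersection with T* discards the axioms themselves, which leaves the words a^n b^n of the sample.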
Definition:
For two language families ℒ_1 and ℒ_2, we define Spl(ℒ_1, ℒ_2) (ESpl(ℒ_1, ℒ_2)) as the set of all languages L(G) generated by some (extended) splicing system G = (V, R, A) (G = (V, T, R, A)) with A ∈ ℒ_1 and R ∈ ℒ_2.
The Power of Splicing Systems
Theorem:
The following table holds, where at the intersection of the row marked by X and the column marked by Y we give Z if ℒ(Z) = Spl(ℒ(X), ℒ(Y)), and Z_1/Z_2 if ℒ(Z_1) ⊂ Spl(ℒ(X), ℒ(Y)) ⊂ ℒ(Z_2).
       FIN        REG       CF        CS        RE
FIN    FIN/REG    FIN/RE    FIN/RE    FIN/RE    FIN/RE
REG    REG        REG/RE    REG/RE    REG/RE    REG/RE
CF     CF         CF/RE     CF/RE     CF/RE     CF/RE
CS     CS/RE      CS/RE     CS/RE     CS/RE     CS/RE
RE     RE         RE        RE        RE        RE
The Power of Extended Splicing Systems
Theorem:
The following table holds, where at the intersection of the row marked by X and the column marked by Y we give Z if ℒ(Z) = ESpl(ℒ(X), ℒ(Y)).
       FIN    REG    CF    CS    RE
FIN    REG    RE     RE    RE    RE
REG    REG    RE     RE    RE    RE
CF     CF     RE     RE    RE    RE
CS     RE     RE     RE    RE    RE
RE     RE     RE     RE    RE    RE
Some Lemmas III
Lemma:
For any language families ℒ_1, ℒ_2, ℒ'_1, ℒ'_2 with ℒ_1 ⊆ ℒ'_1 and ℒ_2 ⊆ ℒ'_2, we have ESpl(ℒ_1, ℒ_2) ⊆ ESpl(ℒ'_1, ℒ'_2).
Lemma:
If a language family ℒ is closed under concatenation with symbols, then ℒ ⊆ ESpl(ℒ, ℒ(FIN)).
Lemma:
ℒ(REG) ⊆ ESpl(ℒ(FIN), ℒ(FIN)).
Some Lemmas IV
Lemma:
For any family ℒ which is closed under union, concatenation, Kleene closure, homomorphisms, inverse homomorphisms and intersections with regular sets, ESpl(ℒ, ℒ(FIN)) ⊆ ℒ.
Lemma:
For any recursively enumerable language L ⊆ T*, there is an extended splicing system G = (V, T, R, A) with a finite set A and a regular set R of splicing rules such that L(G) = L.
Lemma:
For any extended splicing system G = (V, T, R, A), L(G) is a recursively enumerable set.
Some Measures of Descriptional Complexity – Definitions
Definition:
i) For a splicing system G = (V, R, A) or an extended splicing system G = (V, T, R, A), we define the complexity measures r(G), a(G) and l(G) by
r(G) = max{|u| | u = u_i for some u_1#u_2$u_3#u_4 ∈ R, 1 ≤ i ≤ 4},
a(G) = #(A),
l(G) = max{|z| | z ∈ A}.
ii) For a language family ℒ, n ≥ 1 and m ∈ {a, l}, we define the families ℒ_n(r, ℒ) and ℒ_n(m, ℒ) as the sets of languages L(G), where G = (V, R, A) is a splicing system with r(G) ≤ n and A ∈ ℒ, and with m(G) ≤ n and R ∈ ℒ, respectively.
Analogously, for m ∈ {r, a, l} and extended splicing systems, we define the sets ℒ_n(em, ℒ).
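For finite R and A the three measures are direct to compute. The sketch below assumes rules written as strings `r1#r2$r3#r4`; the function names are hypothetical, and the value 0 for an empty set is a convention of this sketch, not part of the definition.

```python
def rule_parts(rule: str) -> list[str]:
    """The four words u_1, u_2, u_3, u_4 of a rule 'u1#u2$u3#u4'."""
    return rule.replace("$", "#").split("#")


def r_measure(rules: set[str]) -> int:
    """r(G): length of the longest word occurring in a rule."""
    return max((len(p) for rule in rules for p in rule_parts(rule)), default=0)


def a_measure(axioms: set[str]) -> int:
    """a(G): number of axioms."""
    return len(axioms)


def l_measure(axioms: set[str]) -> int:
    """l(G): length of the longest axiom."""
    return max((len(z) for z in axioms), default=0)


rules = {"a#b$a#b"}
axioms = {"ab", "aabb", "abab"}
print(r_measure(rules), a_measure(axioms), l_measure(axioms))  # 1 3 4
print(r_measure({"#abc$c#"}))  # 3
```

The measures a(G) and l(G) are finite only for finite axiom sets, which is why the families ℒ_n(a, ℒ) and ℒ_n(l, ℒ) restrict A and let R range over ℒ.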
Some Measures of Descriptional Complexity – Results
Theorem: For any n ≥ 1,
i) ℒ(FIN) ⊂ ℒ_n(r, ℒ(FIN)) ⊂ Spl(ℒ(FIN), ℒ(FIN)),
ii) ℒ_n(r, ℒ(FIN)) ⊂ ℒ_{n+1}(r, ℒ(FIN)).
Theorem: For ℒ ∈ {ℒ(REG), ℒ(CF), ℒ(RE)} and n ≥ 1, ℒ_n(r, ℒ) = ℒ.
Theorem: For any n ≥ 1,
ℒ_n(ea, ℒ(REG)) = ESpl(ℒ(FIN), ℒ(REG)).
Theorem: For any n ≥ 2,
ℒ_1(el, ℒ(REG)) ⊂ ℒ_n(el, ℒ(REG)) = ESpl(ℒ(FIN), ℒ(REG)).