2.7 Distance between compounds

The expansion process can be considered as a series of consecutive synthesis steps. In each step, all those compounds are synthesized that can be produced by the set of available reactions using as substrates only compounds provided by previous steps. Assuming that compound B is in the scope of compound A, that is B ∈ Σ(A), we define the distance d(A, B) from compound A to compound B as the number of consecutive steps required to produce B exclusively from A. The distance d(A, B) is not defined if B is not in the scope of A. To calculate this distance, it is sufficient to partially expand the network starting with the seed A until compound B is reached. As the seed forms generation 1, the distance d(A, B) is one less than the generation in which B appears.

If A is in the scope of B and B is in the scope of A, both distances d(A, B) and d(B, A) are defined but not necessarily the same. This asymmetry can be explained as follows. When producing B from A, a number of additional end products are generally also synthesized. The direct inversion of this process would require these end products as substrates and therefore does not represent an expansion process starting from B as the only seed compound. Therefore, the synthesis of A from B in general requires different synthesis steps.
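The definition and its asymmetry can be illustrated with a minimal sketch. The representation of a reaction as a pair of substrate and product sets, the helper `rxn`, the function name `distance`, and the four-reaction toy network are all assumptions made for illustration; they are not taken from the thesis.

```python
from typing import FrozenSet, Iterable, Optional, Set, Tuple

# Hypothetical representation: a reaction is (substrates, products).
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def rxn(substrates: Iterable[str], products: Iterable[str]) -> Reaction:
    """Helper (illustrative) that builds a reaction from two name lists."""
    return frozenset(substrates), frozenset(products)


def distance(seed: str, target: str, reactions: Iterable[Reaction]) -> Optional[int]:
    """Number of consecutive expansion steps needed to produce target from seed.

    Returns None if target is not in the scope of seed.
    """
    reactions = list(reactions)
    available: Set[str] = {seed}  # generation 1 consists of the seed only
    steps = 0
    while target not in available:
        # Collect the new products of every reaction whose substrates
        # are all available after the previous steps.
        new: Set[str] = set()
        for substrates, products in reactions:
            if substrates <= available:
                new |= products - available
        if not new:  # the expansion has stopped without reaching the target
            return None
        available |= new
        steps += 1
    return steps


# Toy network: A yields B plus a side product X; the inverse reaction
# needs X, which B alone can only obtain via the detour B -> Y -> X.
toy = [
    rxn(["A"], ["B", "X"]),
    rxn(["B", "X"], ["A"]),
    rxn(["B"], ["Y"]),
    rxn(["Y"], ["X"]),
]

d_ab = distance("A", "B", toy)  # 1 step
d_ba = distance("B", "A", toy)  # 3 steps
d_ya = distance("Y", "A", toy)  # None: A is not in the scope of Y
```

In this toy network the side product X plays the role that the additional end products play in the argument above: the inverse reaction exists, but it cannot fire until X has been synthesized from B by other means.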

As the expansion process works as a breadth-first traversal through the network, the distance d(A, B) gives the smallest number of consecutive steps in which B can be synthesized from A. However, in general more than one reaction is added per generation. The products of several reactions attached in parallel may together be required for the synthesis of the target compound B. Therefore, the number of required reactions may be larger than the number of consecutive steps.

The expansion algorithm also attaches reactions and compounds which are not required for the synthesis of compound B. Section A.8 of the appendix gives a method that extracts from the partially expanded network only those reactions which are necessary for the synthesis of B.
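The difference between the number of steps and the number of required reactions can be made concrete with a small sketch. It is not the method of section A.8, which is not reproduced here; it is a simple backward trace, assumed for illustration, that records for each compound the reaction that first produced it and then walks back from the target. The reaction representation and the name `necessary_reactions` are hypothetical, and the result is one sufficient set of reactions, not necessarily the only one.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical representation: a reaction is (substrates, products).
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def necessary_reactions(
    seed: str, target: str, reactions: List[Reaction]
) -> Optional[Tuple[int, Set[int]]]:
    """Return (steps, indices of one sufficient set of reactions), or None."""
    # producer[c] is the index of the reaction that first produced c;
    # the seed has no producer.
    producer: Dict[str, Optional[int]] = {seed: None}
    steps = 0
    while target not in producer:
        available = set(producer)
        new: Dict[str, int] = {}
        for i, (substrates, products) in enumerate(reactions):
            if substrates <= available:
                for p in products - available:
                    new.setdefault(p, i)
        if not new:  # target is not in the scope of the seed
            return None
        producer.update(new)
        steps += 1

    # Backward trace: a producing reaction only uses substrates from earlier
    # generations, so the walk always ends at the seed.
    needed: Set[int] = set()
    visited: Set[str] = set()
    stack = [target]
    while stack:
        compound = stack.pop()
        if compound in visited:
            continue
        visited.add(compound)
        r = producer[compound]
        if r is not None and r not in needed:
            needed.add(r)
            stack.extend(reactions[r][0])
    return steps, needed


# Toy network: C and D are made in parallel and are both needed for B;
# the reaction producing E is attached by the expansion but not required.
toy = [
    (frozenset({"A"}), frozenset({"C"})),
    (frozenset({"A"}), frozenset({"D"})),
    (frozenset({"C", "D"}), frozenset({"B"})),
    (frozenset({"A"}), frozenset({"E"})),
]
result = necessary_reactions("A", "B", toy)  # (2, {0, 1, 2})
```

Here B is reached in 2 steps, yet 3 reactions are required, while the reaction producing E is discarded by the trace.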

As examples, the synthesis of citrate from pyruvate (figure 2.14a) and of pyruvate from citrate (figure 2.14b) are given. Both figures show only the reactions required for the corresponding synthesis. The production of pyruvate from citrate requires only 2 steps. The process cannot simply be inverted, as acetate would be required as an additional seed. Therefore, for the reverse process, acetate first has to be synthesized from pyruvate, which results in a path of length 4.

[Figure 2.14. Compounds shown in panel a): Acetaldehyde, Pyruvate, CO2, Oxalate, H2O, Citrate, Acetate, Oxaloacetate. Compounds shown in panel b): Pyruvate, CO2, Citrate, Oxaloacetate, Acetate.]

Figure 2.14: Paths for the synthesis of a) citrate from pyruvate and b) pyruvate from citrate. The first direction (a) requires 4 steps, while the other direction (b) requires only 2. Path (b) cannot simply be inverted, as acetate would be required in the seed.

Accordingly, the distances between all compounds in the network can be calculated. Figure 2.15 shows a histogram of the distances of all pairs of compounds for which a distance exists in the sense defined above. The average of these distances, $\bar{d} = 13.3$, can be seen as the diameter of the network.

It should be noted, however, that this definition is problematic, as pairs of compounds which do not possess a distance do not enter this average. Thus, a very loosely connected network may still have a relatively small diameter, which may appear counterintuitive.
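The averaging and its caveat can be sketched as follows. This is a minimal illustration under stated assumptions: reactions are modelled as pairs of substrate and product sets, only ordered pairs of distinct compounds are counted, and the function names `distances_from` and `distance_statistics` as well as the toy network are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: a reaction is (substrates, products).
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def distances_from(seed: str, reactions: List[Reaction]) -> Dict[str, int]:
    """Distances from seed to every compound in its scope (full expansion)."""
    dist = {seed: 0}
    step = 0
    while True:
        step += 1
        available = set(dist)
        new = {
            p
            for substrates, products in reactions
            if substrates <= available
            for p in products - available
        }
        if not new:  # nothing can be added any more: the scope is complete
            return dist
        for p in new:
            dist[p] = step


def distance_statistics(
    compounds: List[str], reactions: List[Reaction]
) -> Tuple[Counter, float, float]:
    """Histogram, mean distance, and fraction of connected ordered pairs."""
    histogram: Counter = Counter()
    for a in compounds:
        for b, d in distances_from(a, reactions).items():
            if b != a:  # assumption: pairs (A, A) are not counted
                histogram[d] += 1
    n_pairs = sum(histogram.values())
    # Pairs without a defined distance never enter the mean.
    mean = (
        sum(d * n for d, n in histogram.items()) / n_pairs
        if n_pairs
        else float("nan")
    )
    possible = len(compounds) * (len(compounds) - 1)
    fraction = n_pairs / possible if possible else 0.0
    return histogram, mean, fraction


# A very loosely connected toy network: only two of thirty ordered pairs
# are connected, yet the mean distance is as small as it can be.
compounds = ["A", "B", "C", "D", "E", "F"]
toy = [
    (frozenset({"A"}), frozenset({"B"})),
    (frozenset({"C"}), frozenset({"D"})),
]
histogram, mean, fraction = distance_statistics(compounds, toy)
```

Reporting the fraction of connected pairs alongside the mean makes the caveat visible: in the toy network the mean distance is 1.0 although only 2 of 30 ordered pairs are connected.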

More importantly, the distances observed here are significantly larger than those reported in connection with the small-world property of metabolic networks, for example by Wagner and Fell [2001]. The reported small distances arise mainly because, in the graph-theoretical representation used there (see section 1.2), many metabolites are connected through highly connected hub metabolites.

For example, the two compounds glucose and FAD both participate in reactions that require the hub metabolite ATP. Consequently, the two compounds are connected via ATP and have a distance of 2, even though they are chemically quite different. As the expansion process mimics metabolic processes more accurately, its larger distances are probably more realistic.

Figure 2.15: Histogram of the distances between any two compounds, where the second compound can be synthesized from the first. Pairs for which a distance is not defined did not enter the histogram. Axes: distance (horizontal), number of pairs (vertical).

Figure 2.16: Histogram of the distances between any two compounds, where the second compound can be synthesized from the first. Here the functionality of the cofactors ATP/ADP, NAD+/NADH, NADP+/NADPH and CoA is present (solid line). The dashed line shows the distribution of distances for pairs that were also connected without cofactors present. Pairs for which a distance is not defined did not enter the histogram. Axes: distance (horizontal), number of pairs (vertical).

As in the previous section, it can be assumed that the metabolic network initially possesses the functionalities of certain cofactors. Figure 2.16 shows a histogram of the distances if the functionalities of ATP/ADP, NAD+/NADH, NADP+/NADPH and CoA are present. Here, the average distance is $\bar{d} = 13.9$.

In general, the distance between two arbitrarily chosen compounds can only become smaller if additional cofactors are present. However, the two histograms indicate that the number of connected pairs is much higher in the case with cofactors. In fact, 16% of all possible pairs are connected if the cofactors can be utilized, whereas this ratio is only 2% otherwise. Consequently, the expansions reach much farther in the cofactor case, leading to a higher average distance or diameter. When only pairs that were already connected without cofactors are considered (dashed line in figure 2.16), the average distance is in fact smaller, namely $\bar{d} = 9.9$.
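The apparent paradox, that every individual distance shrinks while the average grows, can be reproduced with a few invented numbers. The distances below are made up purely for illustration and are not data from the thesis.

```python
from statistics import mean

# Invented distances for illustration only; not data from the thesis.
without_cofactors = {("A", "B"): 4, ("A", "C"): 6}
with_cofactors = {
    ("A", "B"): 3,   # already connected pair, now shorter
    ("A", "C"): 5,   # already connected pair, now shorter
    ("A", "D"): 9,   # newly connected, far away
    ("A", "E"): 11,  # newly connected, far away
}

# No previously connected pair became longer.
never_longer = all(
    with_cofactors[pair] <= d for pair, d in without_cofactors.items()
)

mean_without = mean(without_cofactors.values())              # 5
mean_with = mean(with_cofactors.values())                    # 7
mean_shared = mean(with_cofactors[p] for p in without_cofactors)  # 4
```

The overall mean rises from 5 to 7 only because distant, newly connected pairs enter the average; restricted to the pairs that were connected before, it falls from 5 to 4.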

Hierarchies