
3.2 Analysis of dynamic surrogate models

3.2.5 Storage

The interpolants used by the numerical model require storage for data points or parameters. Theorem 4 formulates a condition under which the numerical model needs far fewer data points (in fact, a lower-dimensional space) for the same accuracy than storing all output would require. This is one of the main motivations to construct the new model instead of simply storing the output and looking up the values when needed. To fulfill the condition of Theorem 4, the input parameters of the original system must have at least one degenerate dimension. This means that the dynamic $f$ of the original system depends on a set of parameters of lower dimension than the dimension of the parameter space used. See the previous section, section 3.2.2, for an introduction to degenerate parameter spaces.
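As a minimal illustration of such a degeneracy, consider a dynamic that nominally takes two parameters but depends only on their product; the effective parameter space is then one-dimensional. The following sketch is our own hypothetical example, not taken from the original system:

```python
import numpy as np

# Hypothetical example of a degenerate parameter space: the dynamic
# nominally depends on two parameters (a, b), but only through their
# product c = a * b. The effective parameter space is one-dimensional.
def step(x, a, b):
    c = a * b  # the only combination of parameters that enters the dynamic
    return x + 0.1 * (c - x)

# Two different points in the 2D parameter space ...
x0 = 1.0
traj1, traj2 = [x0], [x0]
for _ in range(50):
    traj1.append(step(traj1[-1], a=2.0, b=3.0))  # a*b = 6
    traj2.append(step(traj2[-1], a=1.5, b=4.0))  # a*b = 6

# ... yield identical trajectories, revealing the degeneracy.
print(np.allclose(traj1, traj2))  # True
```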

If it is not clear whether the construction of the surrogate model will save storage or not, it is still informative to construct the model and check the condition of Theorem 4.

If the condition is fulfilled, there is a degeneracy in the parameter space. The degeneracy might not be apparent beforehand, as changing any of the parameters can still result in a different trajectory. The limit cycle system (equation 3.1) is another example where the reduction of the needed storage capacity is significant. Naively storing all observations over time for all initial states would take a three-dimensional hyper-surface: two dimensions for the parameters, and one for time. The reduced model with closed observables only needs two 1D-lines for the dynamic and the observer (figure 3.10, center and right), and one 2D-surface for the translation from the initial state to the new coordinate (figure 3.10, left).

Even in this low-dimensional case, the naive storage of all observations over time for all initial states would take up all the memory of a supercomputer (Tianhe-2, as of 2015). If the same naive sampling is applied together with closed observables, the data fits in the memory of a smartphone (as of 2015, figure 3.16).
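The comparison in figure 3.16 can be reproduced in a back-of-the-envelope fashion. The sketch below uses our own assumed sizes (8 bytes per stored value, roughly 128 GB of storage for a 2015 smartphone, roughly 1.4 PB of memory for Tianhe-2); the point counts follow the dimension counting above:

```python
# Rough storage comparison for the limit cycle example. All sizes are
# order-of-magnitude assumptions, not values from the original text.
BYTES_PER_POINT = 8
SMARTPHONE = 1.28e11    # ~128 GB of storage (2015-era smartphone, assumed)
SUPERCOMPUTER = 1.4e15  # ~1.4 PB of memory (Tianhe-2, approximate)

for N in [10**k for k in range(2, 8)]:
    naive = N**3 * BYTES_PER_POINT              # 2 parameter dims + time
    closed = (N**2 + 2 * N) * BYTES_PER_POINT   # one 2D-surface + two 1D-lines
    print(f"N={N:>8}: naive {naive:.1e} B "
          f"(> supercomputer: {naive > SUPERCOMPUTER}), "
          f"closed {closed:.1e} B (> smartphone: {closed > SMARTPHONE})")

# Around N = 1e5 points per dimension, the naive table (~8 PB) no longer
# fits in the supercomputer's memory, while the closed-observables data
# (~80 GB) still fits on a smartphone -- under these assumed sizes.
```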

[Figure: log-log plot of the total number of points stored over the number of points per dimension, comparing naive storage with closed observables; horizontal lines mark the maximum for a smartphone and the maximum for a supercomputer, and the gap between the curves spans the distance between smartphone and supercomputer.]

Figure 3.16: Comparison of the capacity needed to store the generated data of the spiral example with the storage needed for the surrogate model.

In this section, we formalize the storage requirements of an interpolant and show the reduction of storage through the proof of Theorem 4 below. Consider a function $f \in C^k([0,1]^p, \mathbb{R}^m)$ with $k \in \mathbb{N}_0$ and $p, m \in \mathbb{N}$. Let $\varepsilon > 0$ and consider an interpolant $\tilde{f}$ of $f$ such that

$$\|f - \tilde{f}\| < \varepsilon. \qquad (3.31)$$

If $f$ is a black box (also sometimes called an oracle) and can hence only be queried at distinct points $x_i \in [0,1]^p$ to yield the values $f(x_i)$, the interpolant is constructed by sampling the space $[0,1]^p$ with a sampling method $S$. This method $S$ needs $S(\varepsilon, p)$ sampling points in the construction of the interpolant $\tilde{f}$. For the full-grid sampling method, $S(\varepsilon, p)$ increases exponentially in $p$, since if $N$ different samples of one coordinate axis are considered, $N^p$ points must be stored for the $p$-dimensional space. If the function $f$ has regularity $k > 0$, the error between $f$ and $\tilde{f}$ decreases with the order $O(N^{-k/p})$. More advanced sampling methods such as sparse grids exist (Bungartz and Griebel, 2004), where the number of points for a comparable accuracy increases with $O(N (\log N)^{p-1})$, mitigating the curse of dimensionality. For all sampling methods, decreasing $p$ without changing the smoothness properties of the function is a good way to reduce the number of points that must be stored for $\tilde{f}$ in order to achieve an accuracy smaller than $\varepsilon$.
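A minimal sketch of this growth, with an arbitrary resolution of $N = 100$ points per axis (our own illustration, not code from the original text):

```python
import numpy as np

# Full-grid sampling: N points per axis in p dimensions requires N**p
# stored values -- the curse of dimensionality.
N = 100
for p in range(1, 7):
    print(f"p={p}: full grid stores N**p = {N**p:.1e} points")

# Sparse grids (Bungartz and Griebel, 2004) reduce this to
# O(N * (log N)**(p-1)) points for comparable accuracy:
for p in range(1, 7):
    sparse = N * np.log(N) ** (p - 1)
    print(f"p={p}: sparse grid order ~ {sparse:.1e} points")
```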

We approximate the output of a discrete dynamical system (equations 3.6–3.8) to find an interpolant $\tilde{I}$ for the function

$$I : \mathbb{N} \times P \to Y, \quad I(t, x(0)) = y(f^t(x(0))) \approx \tilde{I}(t, x(0)). \qquad (3.32)$$

If $P$ is $p$-dimensional, the function $I$ has $p+1$ parameters. Hence, an interpolant $\tilde{I}$ will need $S(\varepsilon, p+1)$ points for an approximation accuracy of at least $\varepsilon$. With closed observables, we only need $S(\varepsilon, p)$ points for the same accuracy, which will be shown through Theorem 4. The reduction from $p+1$ to $p$ dimensions might not seem like much, but the dimensionality reduction concerns the time variable: as we construct a model, only its intrinsic state space must be sampled and stored. The time variable might need significantly more sampling points than the intrinsic state space, and in that case, the reduction is significantly larger than a reduction of one coordinate in an equally-spaced grid.
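Structurally, the surrogate trades stored time samples for iteration of the learned dynamic. The following schematic is our own sketch; the names z0_interp, g_interp, and y_interp stand for already-fitted interpolants of the initial map, the dynamic, and the observer:

```python
# Schematic evaluation of the surrogate with closed observables.
# Instead of storing I(t, x0) on a grid over time AND parameters,
# we store interpolants over parameter/state space only and
# recover the time dependence by iterating the dynamic g.
def evaluate_surrogate(t, x0, z0_interp, g_interp, y_interp):
    z = z0_interp(x0)      # initial map: parameter -> intrinsic state
    for _ in range(t):     # time is generated, not stored
        z = g_interp(z)    # one step of the learned dynamic
    return y_interp(z)     # observer on the intrinsic state

# Toy usage with trivial closed-form "interpolants" (illustration only):
y5 = evaluate_surrogate(5, 0.3,
                        z0_interp=lambda x: 2.0 * x,
                        g_interp=lambda z: 0.9 * z,
                        y_interp=lambda z: z**2)
print(y5)  # y(g^5(z0(0.3))) = (0.9**5 * 0.6)**2
```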

Theorem 4 (Storage reduction). Let $P \subset \mathbb{R}^{\dim P}$ be the parameter space, and $z_0 : P \to z_0(P)$ the initial map of the surrogate model, mapping $P$ into the intrinsic state space. Define $n_0 := \dim(P)$ and $d := \dim(z_0(P))$. Given a desired interpolation accuracy $\varepsilon > 0$ and a sampling method $S$ as defined above, the interpolants $\phi_I$, $g_I$, and $\tilde{y}_I$ need at most $O(S(\varepsilon, n_0))$ points if the following condition is true:

$$d < n_0 + 1. \qquad (3.33)$$

In contrast to that, storing the output of the original model needs $O(S(\varepsilon, n_0 + 1))$ points, where the additional dimension approximates the time variable.

Proof. Define $m := \dim(y(X))$.

1. For the $n_0$-dimensional input, we need exactly $d$ surfaces with dimension $n_0$ each to store the mapping from input to the new variables.

2. We also need $d$ surfaces with dimension $d$ each to store the dynamic $g$ on the space $z_0(P)$.

3. Finally, we need $m$ surfaces with dimension $d$ each to store the observer on the new space.

4. Since storing a finite number of surfaces does not add another dimension, the maximum dimensionality we need to store the new model is $\max(d, n_0)$.

5. To store the observed values over time, we need $m$ surfaces ($n_0$-dimensional) for each iteration step $n$, so in total we need $n_0 + 1$ dimensions.

6. The exact number of hyper-surfaces, including their dimension, for the new model is given by $d \cdot \underline{d} + m \cdot \underline{d} + d \cdot \underline{n_0}$, where the underlined values stand for dimensions and the multiplication between a number and a dimension is denoted by $(\cdot)$. Note that the dimension is not distributive, that is, $a \cdot (\underline{b} + \underline{c}) \neq a \cdot \underline{b} + a \cdot \underline{c}$. The dimensions are additive, so $\underline{a} + \underline{b} = \underline{a + b}$. This allows for a reformulation into

$$\text{storage(new model)} = (d + m) \cdot \underline{d} + d \cdot \underline{n_0}. \qquad (3.34)$$

7. The exact number of hyper-surfaces, including their dimension, for the storage of all observations is given by $m \cdot (\underline{n_0} + \underline{1})$. Reformulation yields

$$\text{storage(full output)} = m \cdot \underline{n_0 + 1}. \qquad (3.35)$$

8. The theorem follows from comparing and reducing the storage formulations in items 6 and 7, as the sketch below illustrates for a concrete case.
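To make the counting in items 6 and 7 concrete, the sketch below instantiates the formulas for the limit cycle example ($n_0 = 2$, $d = 1$, $m = 1$) under full-grid sampling with an assumed $N = 1000$ points per dimension; the numbers are our own illustration:

```python
# Point counts for the storage formulas of Theorem 4 under full-grid
# sampling with N points per dimension (illustrative numbers).
def storage_new_model(N, n0, d, m):
    # d surfaces of dim n0 (initial map) + d of dim d (dynamic)
    # + m of dim d (observer)
    return d * N**n0 + d * N**d + m * N**d

def storage_full_output(N, n0, m):
    # m surfaces of dimension n0 + 1 (parameters plus time)
    return m * N**(n0 + 1)

# Limit cycle example: n0 = 2 parameters, d = 1 intrinsic dimension,
# m = 1 observable, N = 1000 points per dimension.
N, n0, d, m = 1000, 2, 1, 1
print(storage_new_model(N, n0, d, m))  # 1_002_000 points
print(storage_full_output(N, n0, m))   # 1_000_000_000 points
```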

The storage efficiency (Theorem 4) is relevant because a lot of output needs to be stored for the numerical model to be accurate. The theorem states the conditions under which it is highly advantageous to construct closed observables instead of simply storing the output. The difference can be as large as the difference between the memory of a smartphone and that of a supercomputer (see figure 3.16).