

4.1.1 Experiments

in the sagittal plane, cf. the MBS in Figure 4.2. Thus, the needed parameterization of the discrete angle $\varphi_1$ as well as the discrete displacements $u_2$ and $u_3$ of the MBS is given by

$$\bar{u}_S = \underbrace{\left[ u_2 + (\cos\varphi_1 - 1)\,\bar{X}_2 - \sin\varphi_1\,\bar{X}_3 \right]}_{\bar{u}_{S2}} e_2 + \underbrace{\left[ u_3 + \sin\varphi_1\,\bar{X}_2 + (\cos\varphi_1 - 1)\,\bar{X}_3 \right]}_{\bar{u}_{S3}} e_3 .$$
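For illustration, this parameterization amounts to only a few lines of code. The following sketch (the function name and point-wise arguments are purely illustrative, not taken from the original implementation) evaluates the two components $\bar{u}_{S2}$ and $\bar{u}_{S3}$ at a surface point with reference coordinates $\bar{X}_2$, $\bar{X}_3$:

```python
import numpy as np

def ivd_surface_displacement(phi1, u2, u3, X2bar, X3bar):
    """Prescribed IVD surface displacement components (uS2, uS3) for
    the MBS degrees of freedom (phi1, u2, u3) at a surface point with
    reference coordinates (X2bar, X3bar); hypothetical helper that
    mirrors the parameterization above."""
    c, s = np.cos(phi1), np.sin(phi1)
    uS2 = u2 + (c - 1.0) * X2bar - s * X3bar
    uS3 = u3 + s * X2bar + (c - 1.0) * X3bar
    return uS2, uS3
```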

After a deformation state of the MBS is applied to the IVD, its mechanical deformation behaviour, i.e. its reaction stress, is computed using the FE model of [89, 88]. Herein, the integral behaviour $P_C$ of the IVD is captured by applying a homogenisation procedure to the inhomogeneously distributed surface traction vector $\bar{t}$, which is the reaction stress of the IVD due to the applied deformation $\bar{u}_S$, cf. Figure 4.3b). Following [91], the discrete mechanical reaction of the IVD in terms of a force $R = R_i e_i$ and a moment $M = M_i e_i$ is computed via an integration of the surface traction vector $\bar{t}$, viz.:

$$R = \int_{\partial\Omega} \bar{t} \,\mathrm{d}a \qquad \text{and} \qquad M = \int_{\partial\Omega} \bar{r} \times \bar{t} \,\mathrm{d}a . \qquad (4.1)$$

Finally, the relationship between the DOF of the MBS as input quantities $P_D$ and the integral mechanical response $P_C$ resulting from the homogenised FE model (4.1) as output quantity writes as

$$f : P_D \longrightarrow P_C, \qquad (\varphi_1, u_2, u_3) \longmapsto (M_1, R_2, R_3).$$
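As a side note, once the FE solver returns discrete boundary data, the homogenisation (4.1) reduces to simple area-weighted sums. A minimal sketch, assuming arrays `r` and `t` of position and traction vectors at the boundary quadrature points and a vector `da` of associated area weights (all names hypothetical):

```python
import numpy as np

def homogenised_reaction(r, t, da):
    """Discrete counterpart of (4.1): r and t are (n, 3) arrays of
    position and traction vectors at n boundary quadrature points,
    da is the (n,) vector of associated area weights. Returns the
    resultant force R and moment M as 3-vectors."""
    R = (t * da[:, None]).sum(axis=0)               # R = sum_i t_i * da_i
    M = (np.cross(r, t) * da[:, None]).sum(axis=0)  # M = sum_i (r_i x t_i) * da_i
    return R, M
```

Of the resulting components, the coupling only passes $(M_1, R_2, R_3)$ back to the MBS.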

to quasi-static simulations). As comparison we use the results from the Polynomial Chaos Expansion (PCE) approximation [180, 123], which has already been applied in this context in [91]. We ran the VKOGA Algorithm 6 with both the $f$ and $f/P$ selection criteria for different kernels and hyperparameters. For the Gaussian kernel we set a $\gamma$ range of $[15, 70]$, of which we chose 150 linearly spaced values. The Wendland kernel has been used with 90 configurations in total, choosing 30 linearly spaced hyperparameters $\gamma \in [15, 70]$, each in combination with smoothnesses $k \in \{1, 2, 3\}$. Naturally for this setting, the dimension $d = 3$ was chosen for the Wendland kernel. The maximum expansion size was fixed to $M = 600$ for both, along with a maximum absolute and relative error of $10^{-5}$ on the training data as stopping criteria, where the relative error is computed via

$$e_{\mathrm{rel}} = \max_{x \in X} \frac{\left\| f(x) - \tilde{f}(x) \right\|_2}{\left\| f(x) \right\|_2}.$$
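In code, these stopping quantities are simple array reductions. A sketch, assuming `F` and `F_tilde` hold the exact and surrogate outputs row-wise (the guard against division by zero is an addition, not from the text):

```python
import numpy as np

def max_errors(F, F_tilde, tiny=1e-14):
    """Maximum absolute and relative pointwise L2 errors over the
    training data, matching the stopping criteria above; F and
    F_tilde are (n, q) arrays of exact and surrogate outputs."""
    abs_err = np.linalg.norm(F - F_tilde, axis=1)
    rel_err = abs_err / np.maximum(np.linalg.norm(F, axis=1), tiny)
    return abs_err.max(), rel_err.max()
```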

In the case of the $\varepsilon$-SVR, we initially fixed $\varepsilon = 0.013375$, and preliminary experiments showed that the resulting approximation errors were quite sensitive to the combination of hyperparameters $\gamma$ and regularization factors $\lambda$. Consequently, we decided to use an extensive set of configurations and chose 25 linearly spaced samples for both $\lambda$ and $\varepsilon$ from $[10^{-6}, 10^{-3}]$ and $[10^{-3}, 0.1]$, respectively. Additionally, we chose 25 linearly spaced $\gamma$ values from $[10, 35]$ as Gaussian hyperparameters, yielding a total of 15625 different configurations. For each, a maximum iteration count of 60000 and $\delta = 0.0001$ were fixed as stopping conditions.
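The quoted 15625 configurations correspond to a full $25 \times 25 \times 25$ tensor grid over $(\lambda, \varepsilon, \gamma)$; such a grid can be enumerated as in the following sketch (linear spacing as stated above):

```python
import itertools
import numpy as np

# Full tensor grid over (lambda, epsilon, gamma), linearly spaced as in the text.
lambdas = np.linspace(1e-6, 1e-3, 25)   # regularization factors
epsilons = np.linspace(1e-3, 0.1, 25)   # epsilon-insensitive tube widths
gammas = np.linspace(10.0, 35.0, 25)    # Gaussian kernel hyperparameters

configs = list(itertools.product(lambdas, epsilons, gammas))
assert len(configs) == 25 ** 3          # 15625 configurations
```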

Table 4.1 shows details of the approximation results, including run-times and various error measures for the respective best configurations. Among all configurations and all algorithms, the best configuration is the one with the minimum maximal absolute error (point-wise $L_2$ norm) over $X$. The extensive computation time for the $\varepsilon$-SVR is due to the large number of different configurations. On average, one $\varepsilon$-SVR solve took 108 s (which would come up to e.g. 3 h for 100 configurations), so even a single SVR solve is more expensive to compute than the VKOGA variants. Note, however, that in [162] a C/C++

Kernel type:              Wendland kernel K_w           Gauss kernel K_g
Algorithm:                f-VKOGA      f/P-VKOGA      f-VKOGA      f/P-VKOGA      SVR
Configurations            90           90             150          150            15625
Time                      168 s        70.1 s         9.1 s        1.53 s         468 h
Best configuration        γ=37.7586,   γ=30.1724,     γ=16.4765    γ=15           γ=10,
                          k=1, d=3     k=1, d=3                                   λ=0.00075
Expansion size K          600          586            120          1              876
Max. L2 error             4.80567      33.4541        79.3389      37511.3        1224.26
Max. relative L2 error    5.42474      14.0741        16.0223      41930          479.266

Table 4.1: Comparison of approximation results for different kernels and modes.

implementation is presented that is shown to outperform LibSVM in some cases.

Moreover, the VKOGA algorithms using the Gaussian kernel stopped early, indicating a wrong choice of kernel space. If the correct space (i.e. kernel hyperparameter) is chosen, it can be expected that errors in the vicinity of already included points are comparably small. If not, subsequent iterations cause many closely neighboured points to be included in order to "fix" the local errors. This effect is even more severe using the $f/P$-VKOGA variant, as the weighting by the orthogonal remainder norm $\|\tilde{\phi}_x\|_{\mathcal{H}}$ penalizes such errors even more (see Section 2.2.2.1). Numerically, this causes the following: While the expression $\sqrt{z_{i_{\max}} - p_{i_{\max}}}$ in Algorithm 6 (page 76) is analytically always positive (see (2.40), p. 74), it gets close to zero very fast for closely neighboured centers. Now, as the value $p_{i_{\max}}$ is successively accumulated over the iterations, values slightly greater than one may occur, leading to complex roots and hence an emergency stop of the algorithm. Again, this effect is especially inconvenient using the $f/P$-greedy variant, since all the residuals have to be weighted before selecting the next center, cf. lines 22-23 of Algorithm 6. Of course, it is possible to use $\sqrt{\max\{\mathrm{eps},\, z_{i_{\max}} - p_{i_{\max}}\}}$ (see the sketch after the next paragraph), but experiments showed that no real improvements are made once the first values $p_{i_{\max}} > 1$ appear. Now, due to the hierarchical structure of the VKOGA algorithm, it is easy to select the number of points that yielded the best error at a possibly intermediate algorithm step. This also explains the small

number of centers, as 120 points or just one sample have been found to yield the smallest overall error until the numerical issues occurred. Although the described effect is not as severe for the $f/P$-VKOGA variant using the Wendland kernel, both $f/P$-variants using either Gaussian or Wendland kernels are outperformed by their $f$-VKOGA counterparts regarding the maximum errors. Hence, we will narrow down the further discussion to those variants and shall also include the PCE approximation from [91] for comparison.
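The safeguard mentioned above amounts to clamping the radicand before taking the root. A minimal sketch (variable names follow Algorithm 6; the surrounding greedy loop and the accumulation of $p_{i_{\max}}$ are omitted):

```python
import numpy as np

EPS = np.finfo(float).eps

def safe_power_update(z_imax, p_imax):
    """Safeguarded evaluation of sqrt(z_imax - p_imax): analytically
    the radicand is nonnegative, but since p_imax is accumulated over
    the iterations, round-off may push it slightly above z_imax and
    produce a complex root. Clamping avoids the emergency stop, but,
    as noted above, brings no real accuracy gain once values
    p_imax > 1 appear."""
    return np.sqrt(max(EPS, z_imax - p_imax))
```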

Now, Figure 4.4 and Table 4.2 provide more insight into the approximation quality of the selected algorithms. Figure 4.4 shows the absolute and relative approximation errors on $X$ ($L_2$ in $P_D$) in the upper and lower row, respectively. Table 4.2 additionally states the mean and median errors for the respective methods, corresponding to the red and green lines in the plots of Figure 4.4. Most importantly, we sorted the data points in Figure

Figure 4.4: Plots of absolute and relative approximation errors (top/bottom rows) on $X$ for different approximation methods, where the training points of both sets are sorted ascending by the norm of the $Y$ entries (added in magenta in the top right plot). The red/green lines are the mean/median, and the black lines in the upper row denote the respective absolute errors sorted ascending by value.

4.4 by the norm of the corresponding $Y$-values in ascending order, which are also added in magenta in the top right image (absolute PCE errors). The solid black lines in the upper row images show the same absolute errors but sorted by value. It becomes clear that all large relative errors occur at close-to-zero function values, while the relative errors decrease for higher $Y$

values for each method. Hence, the maximum relative errors are less suitable as a quality measure than mean or median errors in this case. Moreover, considering the median error of 0.09671 for the $\varepsilon$-SVR, it not only outperforms the PCE approximation w.r.t. the same measure, but shows that the result is not as bad as suggested by the mean relative error. Interestingly, the PCE is very precise at the smallest values of $Y$, whereas the SVR solution yields an almost constant, higher error level also for close-to-zero training data. With respect to the maximum absolute error,

                        Absolute errors                 Relative errors
                        max       mean      median      max       mean      median
f-VKOGA, Gaussian       79.3389   14.4025   11.2859     16.0223   0.10017   0.00436
f-VKOGA, Wendland       4.80567   1.33274   1.10283     5.42474   0.02711   0.00031
ε-SVR, Gaussian         1224.26   353.849   281.309     479.266   2.55756   0.09671
PCE                     5240.14   705.049   424.474     1.00000   0.21536   0.18245

Table 4.2: Approximation error overview for selected algorithms and configurations.

all approximation methods outperform the PCE approximation, with the VKOGA results being about two orders of magnitude better than the SVR result. While the relative errors of the SVR are worse than those of the PCE, both VKOGA approximations are better in this case as well. This clearly shows the superiority of the $f$-VKOGA approximation using Wendland kernels, which has a mean relative error of only 2% on $X$ compared to 21% for the PCE and matches at least three digits on half of $X$.
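For reference, the statistics reported in Table 4.2, together with the sorting by output norm used in Figure 4.4, are plain array reductions. A sketch, assuming `err` holds the pointwise errors and `Y` the training outputs (names hypothetical):

```python
import numpy as np

def error_summary(err, Y):
    """Max/mean/median of pointwise errors as in Table 4.2, plus the
    error vector reordered by ascending norm of the training outputs
    Y (n, q), i.e. the sorting used in Figure 4.4."""
    order = np.argsort(np.linalg.norm(Y, axis=1))
    return (err.max(), err.mean(), np.median(err)), err[order]
```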

Analytically, the VKOGA solutions should have zero error at each of the 600 support vectors. This is visually confirmed for the Wendland kernel case, where the black line of sorted absolute errors is around $10^{-7}$ for the first 600 points. This effect is less visible for the Gaussian approximation, which indicates that numerical stability becomes an issue during evaluation of the expansion. This is confirmed by considering the resulting expansion coefficients in the direct translate basis: While the Wendland kernel expansion has an average coefficient vector norm $\frac{1}{K}\sum_{i=1}^{K} \|c_i\| = 3.83 \times 10^6$ with entry values (over all coefficient vectors) in the range $[-1.97 \times 10^8, 1.97 \times 10^8]$, the same values are $1.03 \times 10^{15}$ and $[-7.07 \times 10^{15}, 5.46 \times 10^{15}]$ for the Gaussian version. As the interpolation conditions do not apply for the SVR algorithm, its numerical condition is even better, considering values of $4.77 \times 10^5$ as mean coefficient vector norm and entry values ranging in $[-2.13 \times 10^7, 2.13 \times 10^7] = [-C, C]$ with $C := \frac{1}{2\lambda n}$, cf. Proposition 2.1.2.
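These conditioning indicators can be read off the coefficient matrix directly. A sketch, assuming `C` stores the $K$ coefficient vectors row-wise (name hypothetical):

```python
import numpy as np

def coefficient_diagnostics(C):
    """Mean coefficient vector norm (1/K) * sum_i ||c_i|| and the
    range of all entries of the (K, q) coefficient matrix C, as used
    above to compare the numerical conditioning of the expansions."""
    mean_norm = np.linalg.norm(C, axis=1).mean()
    return mean_norm, C.min(), C.max()
```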

Next, Figure 4.5 shows different response surfaces for the $f$-VKOGA approximation using Wendland kernels. Shown are the responses for $(u_2, \varphi_1) \rightarrow$

Figure 4.5: Plots of the approximation using Wendland kernels and the best configuration $\gamma = 37.7586$, $k = 1$, $d = 3$. Top: $(u_2, \varphi_1) \rightarrow R_2$, $(u_2, u_3) \rightarrow M_1$ and $(u_3, \varphi_1) \rightarrow M_1$, each for a fixed value of the remaining input dimension. Bottom: $(u_2, \varphi_1) \rightarrow R_3$ for different $u_3$.

$R_2$, $(u_2, u_3) \rightarrow M_1$ and $(u_3, \varphi_1) \rightarrow M_1$, each for a fixed value of the remaining input dimension. The lower row shows the approximations for $(u_2, \varphi_1) \rightarrow R_3$, with the third argument $u_3$ chosen over three increasing values from the training data range of $u_3$. The red dots are the projected locations of $Y$, and the black dots show the projections of function values on the selected centers, along with a black line drawn to the actual approximated value on the respective center. Note that only in the lower row we plotted all training function values of the selected component. The other plots show only a selection of (center) values whose corresponding data points are in the vicinity of the currently selected third dimension values. The top row illustrates the range of different nonlinear response characteristics, which are all successfully covered by the considered approximation. The clusters

of black markers around the response surface in each image of the lower row show that the actual process to approximate is continuous in the sense that data points with similar $u_3$ values lead to local areas of function values.

Note that this does not imply "locality" of the kernel expansion; taking a closer look into the expansion details reveals quite the opposite. The bounding box of $X$ has a diameter of approximately 22, while the kernel width of $\gamma = 37.7586$ for the best solution is almost double that size, cf. Table 4.1, which clearly shows that each center of the expansion contributes to every function value.
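The quoted diameter is simply the diagonal of the axis-aligned bounding box of the training inputs. A sketch of this check, assuming `X` is an $(n, d)$ array:

```python
import numpy as np

def bounding_box_diameter(X):
    """Diagonal of the axis-aligned bounding box of the training
    inputs X (n, d); compared against the kernel width gamma to
    assess 'locality' of the expansion."""
    return np.linalg.norm(X.max(axis=0) - X.min(axis=0))
```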

Finally, we shall look into the speedup gained using the surrogate models.

So far, there have been no directly coupled full-scale simulations of the MBS yet, and the application of the PCE in [91] made the MBS simulation possible in the first place. Consequently, the proven applicability of the PCE together with the improved performance of the kernel methods on the training data shows that a significant improvement of the simulation quality of the MBS can be achieved. Nevertheless, we shortly derive the expected speedup using kernel surrogate models. Evaluating the three-dimensional kernel expansion with 600 centers averages to $4 \times 10^{-4}$ s (in MATLAB). The present full IVD model implementation does not have a constant run-time due to the quasi-static simulations, each starting from zero input. As the different target configurations $X$ take different overall computation times, we consider the average of about 70 minutes for 24 quasi-time steps, i.e. about three minutes per single step. As "ideal" case (in terms of run-time), we assume each discrete time step in the macromodel requires only one quasi-time step in the IVD model, which already yields a speedup factor of $18 \times 10^5$. Considering realistic simulations of 24 disks in a full MBS system over 30 seconds with a time step of $\Delta t = 0.001$ s, we obtain $24 \cdot 30 \cdot 1000 = 720000$ queries to $f$. In the above setting this would result in 600 h evaluation time of the full IVD model responses, compared to 4.8 min for the surrogate model evaluations.
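The query count and the surrogate evaluation time of this estimate can be reproduced directly (numbers as stated in the text; the per-step cost of the full IVD model is deliberately left as a parameter of the comparison, since its run-time is not constant):

```python
# Query count and surrogate evaluation time for the scenario above.
n_disks, t_sim, dt = 24, 30.0, 0.001       # disks, simulated seconds, step size
queries = n_disks * round(t_sim / dt)      # 24 * 30 * 1000 = 720000 queries to f
t_kernel = 4e-4                            # s per kernel expansion evaluation (MATLAB)
print(queries, queries * t_kernel / 60.0)  # 720000 queries, 4.8 minutes in total
```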