
In cases where both $n$ and the dimension of $\Psi$ are high, advanced iterative solvers suitable for multiple right-hand sides must be employed [156, 157]. These imply an added computational burden which scales sublinearly with the dimension of $\Psi$.
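As a minimal illustration of the multiple-right-hand-side setting, the sketch below amortizes cost by factorizing a sparse system matrix once and reusing the factors for all right-hand sides. This uses a direct sparse factorization rather than the iterative solvers of [156, 157]; the matrix, its size, and the number of right-hand sides are illustrative stand-ins, not taken from the thesis.

```python
# Sketch: factor once, solve for many right-hand sides (illustrative values).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
B = np.random.rand(n, 50)          # 50 right-hand sides, e.g. one per direction

lu = spla.splu(A)                  # expensive factorization, paid once
X = lu.solve(B)                    # cheap triangular solves for all columns of B
print(np.linalg.norm(A @ X - B))   # residual check of the multi-RHS solve
```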

In the incompressible case of the solid mechanics problem, pressure must be taken into account. For that purpose, the pressure trial solutions $p \in L^2(\Omega_0)$ and weighting functions $q \in L^2(\Omega_0)$ should also be introduced [146].

2.4 Outlook

In the preceding sections, the fundamentals of continuum and numerical mechanics were outlined and used to build a forward problem. Afterwards, an inverse problem was formulated to resolve unknown quantities based on observations.

Although many different elastography techniques exist, the underlying uncertainties, introduced, for example, by noise and incomplete observations, are usually neglected.

Recently published research incorporates these underlying uncertainties. In [111], a Bayesian estimation of material parameters is included to model non-rigid image registration more accurately. However, only six material parameters and their uncertainties are derived, assuming constant properties over different regions. In [114, 108], more material parameters and their uncertainties are estimated for a linear elastic material model. Nevertheless, the unknowns are derived by sampling, which is computationally very expensive when many unknowns are involved.

In the following chapter, we develop an uncertainty quantification method to solve high-dimensional nonlinear inverse problems. Then, within the application of strain elastography, which is usually a high-dimensional problem, we not only quantify the underlying mechanical properties but also account for their uncertainties. We especially focus on reducing the number of dimensions as well as the computational cost in order to make the derived solution strategies feasible for large systems. In addition, we propose methods for quantifying model errors. Although we use elastography as our application example, these methods are easily transferable to other problems of interest.

Chapter 3

Sparse Variational Bayesian approximations for nonlinear inverse problems

[...] the statistician knows [...] that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world.

George Box, 1919-2013 [158].

This chapter is based on the publication: I. M. Franck, P.S. Koutsourelakis, Sparse Variational Bayesian approximations for nonlinear inverse problems: Applications in nonlinear elastography, Computer Methods in Applied Mechanics and Engineering, Volume 299, 1 February 2016, Pages 215-244 [159]

3.1 Problem description and introduction

Inference methods, such as Monte Carlo or Variational Bayesian strategies, usually become problematic as the number of unknowns and the problem dimensionality grow.

In such problems, dimensionality reduction plays a pivotal role, more specifically the identification of lower-dimensional features that provide the strongest signature of the unknowns and the corresponding posterior. Discovering a sparse set of features has attracted great interest in many applications, such as the representation of natural images [160] or, more generally, signal processing. A host of algorithms has been developed for finding such representations, as well as appropriate dictionaries for achieving this goal [161, 162, 163, 164]. While all of these tools are pertinent to the present problem, they differ from it in a fundamental way: they are based on several data/observations/instantiations of the vector that we seek to represent.
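To make the classical direct-observation setting of [161, 162, 163, 164] concrete, the following is a minimal sketch of greedy sparse coding (orthogonal matching pursuit) over a random dictionary; the dictionary, sizes, and signal are all illustrative, and this is our own toy example rather than an algorithm from those references.

```python
# Sketch: greedy sparse coding of a directly observed vector x over dictionary D.
import numpy as np

def omp(D, x, k):
    """Select at most k dictionary atoms that best reconstruct x."""
    support, residual, coef = [], x.copy(), np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))    # most correlated atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef           # re-fit on current support
    z = np.zeros(D.shape[1])
    z[support] = coef
    return z

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)                        # unit-norm atoms
x = D[:, [3, 17, 99]] @ np.array([1.0, -0.5, 2.0])    # a truly 3-sparse signal
print(np.nonzero(omp(D, x, 3))[0])                    # typically recovers {3, 17, 99}
```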

However, in our problems we do not have such direct observations, i.e., the available data pertains to the output of a model which is nonlinearly and implicitly dependent on the vector of unknowns. Furthermore, we are primarily interested in approximating the posterior of this vector rather than simply performing dimensionality reduction. We demonstrate how this can be done by using a fully Bayesian formulation and employing the marginal likelihood or evidence as the ultimate model validation metric for any proposed dimensionality reduction.

Let the vector $\Psi \in \mathbb{R}^{d_\Psi}$ represent any model parameters for which a model output $y(\Psi) \in \mathbb{R}^{d_y}$ is available (forward run) and whose calibration is of interest. We also presuppose the availability of the derivatives $\partial y / \partial \Psi$ with respect to the model parameters. For problems of practical interest, it is assumed that the dimension $d_\Psi$ of the unknowns is very large, which poses a significant hindrance in finding proper regularization (in deterministic settings [165]) or in specifying appropriate priors (in probabilistic settings [166, 167]). The primary focus of the Bayesian model developed in this section is two-fold:

• Find lower-dimensional representations of the unknown parameter vector $\Psi$ that capture as much as possible of the associated posterior density.

• Enable the computation of the posterior density with as few forward calls (i.e., evaluations of $y(\Psi)$ and $\partial y / \partial \Psi$) as possible; a toy interface for these quantities is sketched below.
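The sketch below illustrates the interface assumed above: a forward model $y(\Psi)$ and its derivatives $\partial y / \partial \Psi$. In elastography, forward() would wrap a PDE solve; here a hypothetical two-output toy function stands in, and the Jacobian is approximated by finite differences purely for illustration.

```python
# Sketch of the assumed forward-model interface (toy stand-in, not the thesis code).
import numpy as np

def forward(psi):
    """Stand-in for the forward model y(Psi) in R^{d_y}."""
    return np.array([psi[0] ** 2 + psi[1],
                     np.sin(psi[1]) * psi[2]])

def jacobian_fd(psi, h=1e-6):
    """Finite-difference approximation of dy/dPsi, shape (d_y, d_Psi)."""
    y0 = forward(psi)
    J = np.zeros((y0.size, psi.size))
    for i in range(psi.size):
        dpsi = psi.copy()
        dpsi[i] += h
        J[:, i] = (forward(dpsi) - y0) / h
    return J

psi = np.array([1.0, 0.5, 2.0])
print(forward(psi))
print(jacobian_fd(psi))   # each forward call is the expensive operation to minimize
```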

We denote by $\hat{y} \in \mathbb{R}^{d_y}$ the vector of observations/measurements. In the context of elastography, the observations are displacements (in the static case) and/or velocities (in dynamics). The extraction of this data from images (ultrasound or MRI) is a challenging topic that requires sophisticated image registration techniques [168, 169]. Naturally, this compromises the informational content of the raw data (i.e., the images). In this study, we ignore the error introduced by the image registration process, as the emphasis is on the inversion of the continuum mechanics, PDE-based model, and assume that the displacement data are contaminated with noise.

We postulate the presence of i.i.d. Gaussian noise, denoted here by the random vector $z \in \mathbb{R}^{d_y}$, such that:

\[
\hat{y} = y(\Psi) + z, \qquad z \sim \mathcal{N}(0, \tau^{-1} I_{d_y}). \tag{3.1}
\]

Here $\mathcal{N}(z \mid 0, \tau^{-1} I_{d_y})$ denotes a multivariate normal distribution of $z$ with mean $0$ and covariance $\tau^{-1} I_{d_y}$. Often the short-hand notation $\mathcal{N}(0, \tau^{-1} I_{d_y})$ is used, which omits the random variable that obeys the distribution. We assume that each entry of $z$ has zero mean and an unknown variance $\tau^{-1}$, which will also be inferred from the data. We note that other models can also be employed, such as impulsive noise to account for outliers due to instrument calibration (e.g., faulty sensors) or experimental conditions [170]. Generally, the difference between observed and model-predicted outputs can be attributed not only to observation errors (noise), but also to model discrepancies arising from the discretization of the governing equations. Another source of error is the inadequacy of the model itself in capturing the underlying physical process. While the former source can be reduced by considering very fine discretizations (at the cost of increasing the dimensionality of the state vector $u$ and potentially of $\Psi$), the latter requires a much more thorough treatment [46, 171, 172, 100, 108, 173, 174], on which we focus in Chapter 5. Within this chapter, such model errors are lumped with the observation errors in the $z$-term.
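A minimal sketch of the noise model in Equation (3.1) follows; y_model is a hypothetical stand-in for the model output $y(\Psi)$, and tau_true is an assumed precision, both chosen purely for illustration.

```python
# Sketch: generate synthetic observations per Eq. (3.1), y_hat = y(Psi) + z.
import numpy as np

rng = np.random.default_rng(1)
tau_true = 100.0                            # precision tau; each entry has variance 1/tau
y_model = np.array([0.10, -0.30, 0.25])     # hypothetical model output y(Psi)
z = rng.normal(0.0, tau_true ** -0.5, size=y_model.shape)
y_hat = y_model + z                         # noisy observations
print(y_hat)
```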

The likelihood function of the observed data $\hat{y}$, i.e., its conditional probability density given the model parameters $\Psi$ (and implicitly the model $\mathcal{M}$ itself, as described by Equation (2.33) and the resulting $y(\Psi)$) and $\tau$, is:

\[
p(\hat{y} \mid \Psi, \tau) = \left( \frac{\tau}{2\pi} \right)^{d_y/2} e^{-\frac{\tau}{2} \| \hat{y} - y(\Psi) \|^2}. \tag{3.2}
\]

In the advocated Bayesian framework, one also needs to specify priors on the unknown parameters. We defer a detailed discussion of the priors associated with $\Psi$ to the next section, where the dimensionality reduction aspects are discussed. With regard to the noise precision $\tau$, we employ a (conditionally) conjugate Gamma prior [175], i.e.,

\[
p_\tau(\tau) = \operatorname{Gamma}(a_0, b_0). \tag{3.3}
\]

The values of the parameters are taken to be $a_0 = b_0 = 0$ in the following examples. This corresponds to a limiting case where the density degenerates to an improper, non-informative Jeffreys prior, i.e., $p_\tau(\tau) \propto \frac{1}{\tau}$, which is scale-invariant [48]. Naturally, more informative choices can be made if such information is available a priori.
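To make Equations (3.2) and (3.3) concrete, the following sketch (our illustration, not code from the thesis) evaluates the log of the likelihood (3.2) and the standard conditional conjugate update for $\tau$ that the Gamma prior admits; all variable names and numeric values are hypothetical.

```python
# Sketch of Eqs. (3.2)-(3.3): Gaussian log-likelihood and conjugate tau update.
import numpy as np

def log_likelihood(y_hat, y_model, tau):
    """Log of Eq. (3.2): (d_y/2) log(tau/2pi) - (tau/2) ||y_hat - y(Psi)||^2."""
    r = y_hat - y_model
    return 0.5 * y_hat.size * np.log(tau / (2.0 * np.pi)) - 0.5 * tau * (r @ r)

def tau_posterior(y_hat, y_model, a0=0.0, b0=0.0):
    """Conditional conjugate update: Gamma(a0, b0) prior combined with the
    Gaussian likelihood (3.2) gives Gamma(a0 + d_y/2, b0 + ||r||^2 / 2)."""
    r = y_hat - y_model
    return a0 + 0.5 * y_hat.size, b0 + 0.5 * (r @ r)

y_hat = np.array([0.11, -0.28, 0.24])      # hypothetical observations
y_model = np.array([0.10, -0.30, 0.25])    # hypothetical model output y(Psi)
print(log_likelihood(y_hat, y_model, tau=100.0))
a, b = tau_posterior(y_hat, y_model)       # a0 = b0 = 0: Jeffreys limit
print(a / b)                               # posterior mean of tau = d_y / ||r||^2
```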