Surrogate Modelling and Uncertainty Quantification in Computational Sciences


Research Collection

Presentation

Surrogate Modelling and Uncertainty Quantification in Computational Sciences

Author(s): Sudret, Bruno

Publication Date: 2020-08-27

Permanent Link: https://doi.org/10.3929/ethz-b-000469598

Rights / License: In Copyright - Non-Commercial Use Permitted


Surrogate Modelling and Uncertainty Quantification in Computational Sciences

Bruno Sudret

Chair of Risk, Safety and Uncertainty Quantification, ETH Zurich


Introduction

Surrogate modelling & UQ Luzern – August 27, 2020 B. Sudret 1 / 40

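What is a computational model?

Complex natural or engineering systems are investigated, designed and assessed using computational models, a.k.a. simulators

A computational model combines:

A mathematical description of the physical phenomena (governing equations), e.g. mechanics, electromagnetism, fluid dynamics, etc.

$$\operatorname{div} \boldsymbol{\sigma} + \boldsymbol{f} = \boldsymbol{0}, \qquad \boldsymbol{\sigma} = \boldsymbol{D} \cdot \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} = \frac{1}{2}\left(\nabla \boldsymbol{u} + \nabla^{\mathsf{T}} \boldsymbol{u}\right)$$

Discretization techniques which transform continuous equations into linear algebra problems

Algorithms to solve the discretized equations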

Why do we use computational models?

To better understand physical phenomena, i.e. test theories and assumptions against real-world observations → Model calibration

To answer “what if?” questions: vary parameters within some ranges and see what happens → Parametric study

To find out important parameters that drive the model predictions → Sensitivity analysis

Why do we use computational models (in engineering)?

To explore the design space by creating virtual prototypes → Model exploration

To optimize the system’s performance (e.g. minimize its mass while ensuring certain behaviour) → Optimization

To assess its robustness w.r.t. uncertainties in the environmental & usage conditions → Uncertainty quantification / reliability

What about computational costs?

Computer power has grown tremendously over the last decades (GigaFlops → TeraFlops → PetaFlops ...)

Modellers already use the available power for a single run, e.g. “virtual universe simulation” by Teyssier et al.: 80 hours on 4,000+ GPU nodes

Figure: Piz Daint supercomputer; cosmic web (Image: J. Stadel)

How to carry out a parametric study / model exploration with:

Costly simulators

Complex input/output (nonlinear) behaviour

High-dimensional input space

→ Surrogate models

Outline

Surrogate models

Basics of uncertainty quantification

Polynomial chaos expansions

- Principle
- Computing the coefficients

Applications

- Subsurface flow
- Machine learning benchmarks

Surrogate models

Input $X$ → Computational model $\mathcal{M}$ → Output $Y = \mathcal{M}(X)$

A surrogate model $\tilde{\mathcal{M}}$ is an approximation of the original computational model $\mathcal{M}$ with the following features:

It is built from a limited set of runs of the original model $\mathcal{M}$, called the experimental design $\mathcal{X} = \{x^{(i)}, i = 1, \dots, n\}$, that yield the model responses $\mathcal{Y} = \{y^{(i)} = \mathcal{M}(x^{(i)}), i = 1, \dots, n\}$

It assumes some regularity of the model $\mathcal{M}$ and some general functional shape

It is fast to evaluate

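Surrogate models: examples

| Name | Shape | Parameters |
|---|---|---|
| Polynomial chaos expansions | $\tilde{\mathcal{M}}(x) = \sum_{\alpha \in \mathcal{A}} a_\alpha \Psi_\alpha(x)$ | $a_\alpha$ |
| Kriging (a.k.a. Gaussian processes) | $\tilde{\mathcal{M}}(x) = \beta^{T} \cdot f(x) + Z(x, \omega)$ | $\beta, \sigma_Z^2, \theta$ |
| Support vector machines | $\tilde{\mathcal{M}}(x) = \sum_{i=1}^{m} a_i K(x_i, x) + b$ | $a, b$ |
| Neural networks | $\tilde{\mathcal{M}}(x) = f_2(b_2 + f_1(b_1 + w_1 \cdot x) \cdot w_2)$ | $w, b$ |
| Low-rank tensor approximations | $\tilde{\mathcal{M}}(x) = \sum_{l=1}^{R} b_l \left( \prod_{i=1}^{M} v_l^{(i)}(x_i) \right)$ | $b_l, z_{k,l}^{(i)}$ |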

Ingredients for building a surrogate model

Select an experimental design $\mathcal{X}$ that covers the domain of input parameters as well as possible: Latin hypercube sampling (LHS), low-discrepancy sequences

Run the computational model $\mathcal{M}$ onto $\mathcal{X}$

Smartly post-process the data $\{\mathcal{X}, \mathcal{M}(\mathcal{X})\}$ through a learning algorithm

| Name | Learning method |
|---|---|
| Polynomial chaos expansions | sparse grid integration, least-squares, compressive sensing |
| Low-rank tensor approximations | alternate least squares |
| Kriging | maximum likelihood, Bayesian inference |
| Support vector machines | quadratic programming |

Validate the surrogate model, e.g. estimate a global error $\varepsilon = \mathbb{E}\left[\left(\mathcal{M}(X) - \tilde{\mathcal{M}}(X)\right)^2\right]$
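The four ingredients above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `model` is a hypothetical cheap stand-in for an expensive simulator, the functional shape is a plain monomial basis rather than one of the surrogates listed above, and the sample sizes are arbitrary.

```python
import numpy as np
from scipy.stats import qmc


def model(x):
    """Hypothetical stand-in for an expensive simulator (2 inputs in [0, 1])."""
    return np.sin(2.0 * np.pi * x[:, 0]) + 0.5 * x[:, 1] ** 2


def features(x, degree=5):
    """Monomials x1^a * x2^b with a + b <= degree (assumed functional shape)."""
    cols = [x[:, 0] ** a * x[:, 1] ** b
            for a in range(degree + 1) for b in range(degree + 1 - a)]
    return np.column_stack(cols)


# 1. Experimental design: Latin hypercube sampling of the input domain
sampler = qmc.LatinHypercube(d=2, seed=0)
x_train = sampler.random(n=100)

# 2. Run the computational model on the experimental design
y_train = model(x_train)

# 3. Learning step: ordinary least squares on the polynomial features
coeffs, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)


def surrogate(x):
    """Fast-to-evaluate approximation of `model`."""
    return features(x) @ coeffs


# 4. Validation: mean-square error on an independent sample,
#    normalized by the variance of the model output
x_val = np.random.default_rng(1).random((10_000, 2))
y_val = model(x_val)
mse = np.mean((y_val - surrogate(x_val)) ** 2)
rel_error = mse / np.var(y_val)
print(f"relative validation error: {rel_error:.2e}")
```

Normalizing the error by the output variance is a design choice made here so that the number is easier to interpret; the slide only defines the unnormalized mean-square error.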

Wait, isn’t it machine learning?

I. Goodfellow, Y. Bengio, A. Courville, Deep learning, MIT Press (2017)

Machine learning aims at making predictions by building a model based on data

Unsupervised learning aims at discovering a hidden structure within unlabelled data $\{x^{(i)}, i = 1, \dots, n\}$

Supervised learning considers a training data set:

$$\mathcal{X} = \{(x^{(i)}, y^{(i)}), i = 1, \dots, n\}$$

where:

$x^{(i)}$’s are the attributes / features (input space)

$y^{(i)}$’s are the labels (output space)

Wait, isn’t it machine learning?

Classification

In classification problems, the labels are discrete, e.g. $y^{(i)} \in \{-1, 1\}$. The goal is to predict the class of a new point $x$

Logistic regression - Support vector machines - (Deep) neural networks

Regression

In regression problems, the labels are continuous, say $y^{(i)} \in \mathcal{D}_Y \subset \mathbb{R}$. The goal is to predict the value $\hat{y} = \tilde{\mathcal{M}}(x)$ for a new point $x$

Neural networks - Gaussian process models - Support vector regression

Bridging supervised learning and surrogate modelling

| Features | Supervised learning | Surrogate modelling |
|---|---|---|
| Computational model $\mathcal{M}$ | ✗ | ✓ |
| Input space $X \sim f_X$ | ✗ | ✓ |
| Training data $\mathcal{X} = \{(x_i, y_i), i = 1, \dots, n\}$ | ✓ Training data set (big data) | ✓ Experimental design (small data) |
| Validation (resp. cross-validation) | ✓ Validation set | ✓ Leave-one-out CV |

Prediction goal: for a new $x \notin \mathcal{X}$, $y(x)$? E.g. $\sum_{i=1}^{m} y_i K(x_i, x) + b$

Advantages of surrogate models

Usage

$\mathcal{M}(x) \approx \tilde{\mathcal{M}}(x)$: hours per run for the original model vs. seconds for $10^6$ runs of the surrogate

Advantages

Non-intrusive methods: based on runs of the computational model

Suited to high performance computing: “embarrassingly parallel”

Similarities with big data analysis

Challenges

Need for rigorous validation

Communication: advanced mathematical background


Global framework for uncertainty quantification

Step A: Model(s) of the system, assessment criteria (computational model)

Step B: Quantification of sources of uncertainty (random variables)

Step C: Uncertainty propagation (moments, probability of failure, response PDF)

Step C’: Sensitivity analysis

B. Sudret, Uncertainty propagation and sensitivity analysis in mechanical models – contributions to structural reliability and stochastic spectral methods (2007)

Step B: Quantification of the sources of uncertainty

Goal: represent the uncertain parameters based on the available data and information → Probabilistic model $f_X$

Experimental data is available

What is the distribution of each parameter?

What is the dependence structure? → Copula theory

Figure legend: Data, Normal, LN, Gamma

No data is available: expert judgment

Engineering knowledge (e.g. reasonable bounds and uniform distributions)

Statistical arguments and literature (e.g. extreme value distributions for climatic events)

Scarce data + expert information → Bayesian statistics

Step C: uncertainty propagation

Goal: estimate the uncertainty / variability of the quantities of interest (QoI) $Y = \mathcal{M}(X)$ due to the input uncertainty $f_X$

Output statistics, i.e. mean, standard deviation, etc.

$$\mu_Y = \mathbb{E}_X\left[\mathcal{M}(X)\right], \qquad \sigma_Y^2 = \mathbb{E}_X\left[\left(\mathcal{M}(X) - \mu_Y\right)^2\right]$$

Distribution of the QoI (response PDF)

Probability of exceeding an admissible threshold $y_{adm}$ (probability of failure)

$$P_f = \mathbb{P}\left(Y \ge y_{adm}\right)$$

Uncertainty propagation using Monte Carlo simulation

Principle

Generate virtual realizations of the system using random numbers

A sample set $\mathcal{X} = \{x_1, \dots, x_n\}$ is drawn according to the input distribution $f_X$

For each sample the quantity of interest (resp. performance criterion) is evaluated, say $\mathcal{Y} = \{\mathcal{M}(x_1), \dots, \mathcal{M}(x_n)\}$

The set of model outputs is used for moments-, distribution-, quantile- or reliability analysis
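A minimal sketch of this principle in Python is given below. The model, the input distribution (three independent standard normal variables) and the threshold `y_adm` are hypothetical choices made only for illustration.

```python
import numpy as np


def model(x):
    """Hypothetical cheap model: Y = X1 + X2 * X3."""
    return x[:, 0] + x[:, 1] * x[:, 2]


rng = np.random.default_rng(42)
n = 100_000

# Draw the sample set according to the (assumed) input distribution f_X
x = rng.standard_normal((n, 3))

# Evaluate the quantity of interest for each sample
y = model(x)

# Moments and quantiles of the output
mean_y = y.mean()
std_y = y.std(ddof=1)
q95 = np.quantile(y, 0.95)

# Probability of exceeding an admissible threshold, with its standard error
y_adm = 3.0
pf = np.mean(y >= y_adm)
pf_se = np.sqrt(pf * (1.0 - pf) / n)

print(f"mean = {mean_y:.3f}, std = {std_y:.3f}, q95 = {q95:.3f}")
print(f"Pf = {pf:.4f} +/- {pf_se:.4f}")
```

For this toy model the exact mean is 0 and the exact variance is 2, which gives a quick sanity check on the estimates.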

Uncertainty propagation using Monte Carlo simulation

Figure: samples of the inputs $X_1$, $X_2$, $X_3$ are run through the computational model to produce samples of the output $Y$

Advantages/Drawbacks of Monte Carlo simulation

Advantages

Universal method: only relies upon sampling random numbers and repeatedly running the computational model

Sound statistical foundations: convergence when $n \to \infty$

Suited to High Performance Computing: “embarrassingly parallel”

Drawbacks

Statistical uncertainty: results are not exactly reproducible when a new analysis is carried out (handled by computing confidence intervals)

Low efficiency: convergence rate $\propto n^{-1/2}$

Monte Carlo for reliability analysis

To compute $P_f = 10^{-k}$ with an accuracy of ±10% (coef. of variation of 5%), $4 \cdot 10^{k+2}$ runs are required

→ Need for surrogate models!
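The figure of $4 \cdot 10^{k+2}$ runs follows from the variance of the Monte Carlo estimator of a probability. A short derivation, assuming independent samples and $P_f \ll 1$:

$$\hat{P}_f = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\mathcal{M}(x_i) \ge y_{adm}\}, \qquad \mathrm{Var}\left[\hat{P}_f\right] = \frac{P_f (1 - P_f)}{n}, \qquad CV = \sqrt{\frac{1 - P_f}{n P_f}} \approx \frac{1}{\sqrt{n P_f}}$$

$$CV = 0.05, \; P_f = 10^{-k} \quad \Rightarrow \quad n \approx \frac{1}{0.05^2 \cdot 10^{-k}} = 400 \cdot 10^{k} = 4 \cdot 10^{k+2}$$

The ±10% accuracy corresponds to an approximate 95% confidence interval of two coefficients of variation on either side of the estimate.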


Polynomial chaos expansions in a nutshell

Ghanem & Spanos (1991); Sudret & Der Kiureghian (2000); Xiu & Karniadakis (2002); Soize & Ghanem (2004)

Input $X$ with given PDF $f_X(x) = \prod_{i=1}^{M} f_{X_i}(x_i)$ (dim $X = M$)

Output $Y = \mathcal{M}(X)$ cast as the following polynomial chaos expansion:

$$Y = \sum_{\alpha \in \mathbb{N}^M} y_\alpha \Psi_\alpha(X)$$

where:

$\Psi_\alpha(X)$: basis functions

$y_\alpha$: coefficients to be computed (coordinates)

PCE basis

$\{\Psi_\alpha(X), \alpha \in \mathbb{N}^M\}$ made of multivariate orthonormal polynomials:

$$\Psi_\alpha(x) \stackrel{\text{def}}{=} \prod_{i=1}^{M} \Psi^{(i)}_{\alpha_i}(x_i)$$

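Multivariate polynomial basis

Univariate polynomials

For each input variable $X_i$, univariate orthogonal polynomials $\{P_k^{(i)}, k \in \mathbb{N}\}$ are built:

$$\left\langle P_j^{(i)}, P_k^{(i)} \right\rangle = \int P_j^{(i)}(u) \, P_k^{(i)}(u) \, f_{X_i}(u) \, du = \gamma_j^{(i)} \delta_{jk}$$

e.g., Legendre polynomials if $X_i \sim \mathcal{U}(-1, 1)$, Hermite polynomials if $X_i \sim \mathcal{N}(0, 1)$

Normalization: $\Psi_j^{(i)} = P_j^{(i)} / \sqrt{\gamma_j^{(i)}}$, $i = 1, \dots, M$, $j \in \mathbb{N}$

Tensor product construction

$$\Psi_\alpha(x) \stackrel{\text{def}}{=} \prod_{i=1}^{M} \Psi^{(i)}_{\alpha_i}(x_i), \qquad \mathbb{E}\left[\Psi_\alpha(X) \Psi_\beta(X)\right] = \delta_{\alpha\beta}$$

where $\alpha = (\alpha_1, \dots, \alpha_M)$ are multi-indices (partial degree in each dimension)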
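The construction can be checked numerically. The sketch below assumes two inputs, $X_1 \sim \mathcal{U}(-1,1)$ and $X_2 \sim \mathcal{N}(0,1)$, builds the normalized Legendre and Hermite polynomials with NumPy, and verifies the orthonormality relation by Gauss quadrature. The helper names (`psi_legendre`, `psi_hermite`, `psi`) are introduced here for illustration only.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial import hermite_e, legendre


def psi_legendre(k, u):
    """Legendre polynomial of degree k, normalized w.r.t. U(-1, 1)."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return legendre.legval(u, c) * sqrt(2 * k + 1)


def psi_hermite(k, u):
    """Probabilists' Hermite polynomial of degree k, normalized w.r.t. N(0, 1)."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return hermite_e.hermeval(u, c) / sqrt(factorial(k))


def psi(alpha, x):
    """Tensor-product basis function for X1 ~ U(-1, 1), X2 ~ N(0, 1)."""
    return psi_legendre(alpha[0], x[:, 0]) * psi_hermite(alpha[1], x[:, 1])


# Gauss quadrature rules, rescaled so that the weights integrate each PDF
u1, w1 = legendre.leggauss(20)
w1 = w1 / 2.0                      # uniform density on [-1, 1] is 1/2
u2, w2 = hermite_e.hermegauss(20)
w2 = w2 / sqrt(2.0 * pi)           # standard normal density normalization

grid = np.array([(a, b) for a in u1 for b in u2])
weights = np.array([wa * wb for wa in w1 for wb in w2])

# Gram matrix E[Psi_alpha(X) Psi_beta(X)] for all partial degrees up to 3
alphas = [(i, j) for i in range(4) for j in range(4)]
gram = np.array([[np.sum(weights * psi(a, grid) * psi(b, grid))
                  for b in alphas] for a in alphas])

print(np.allclose(gram, np.eye(len(alphas)), atol=1e-8))
```

The normalization constants used are $\gamma_k = 1/(2k+1)$ for Legendre polynomials under $\mathcal{U}(-1,1)$ and $\gamma_k = k!$ for probabilists' Hermite polynomials under $\mathcal{N}(0,1)$.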

Multivariate polynomial basis: $M = 2$

$\alpha = [3, 3]$: $\Psi_{(3,3)}(x) = \tilde{P}_3(x_1) \cdot \widetilde{He}_3(x_2)$

$X_1 \sim \mathcal{U}(-1, 1)$: Legendre polynomials

$X_2 \sim \mathcal{N}(0, 1)$: Hermite polynomials


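Computing the coefficients by least-square minimization

Isukapalli (1999); Berveiller, Sudret & Lemaire (2006)

Principle

The exact (infinite) series expansion is considered as the sum of a truncated series and a residual:

$$Y = \mathcal{M}(X) = \sum_{\alpha \in \mathcal{A}} y_\alpha \Psi_\alpha(X) + \varepsilon_P \equiv \mathsf{Y}^{T} \Psi(X) + \varepsilon_P(X)$$

where: $\mathsf{Y} = \{y_\alpha, \alpha \in \mathcal{A}\} \equiv \{y_0, \dots, y_{P-1}\}$ ($P$ unknown coefficients) and $\Psi(x) = \{\Psi_0(x), \dots, \Psi_{P-1}(x)\}$

Least-square minimization

The unknown coefficients are estimated by minimizing the mean square residual error:

$$\hat{\mathsf{Y}} = \arg\min \; \mathbb{E}\left[\left(\mathsf{Y}^{T} \Psi(X) - \mathcal{M}(X)\right)^2\right]$$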

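Discrete (ordinary) least-square minimization

$$\hat{\mathsf{Y}} = \arg\min_{\mathsf{Y} \in \mathbb{R}^P} \frac{1}{n} \sum_{i=1}^{n} \left(\mathsf{Y}^{T} \Psi(x^{(i)}) - \mathcal{M}(x^{(i)})\right)^2$$

Procedure

Select a truncation scheme, e.g. $\mathcal{A}^{M,p} = \{\alpha \in \mathbb{N}^M : |\alpha|_1 \le p\}$

Select an experimental design and evaluate the model response $\mathsf{M} = \{\mathcal{M}(x^{(1)}), \dots, \mathcal{M}(x^{(n)})\}^{T}$

Compute the experimental matrix $A_{ij} = \Psi_j(x^{(i)})$, $i = 1, \dots, n$; $j = 0, \dots, P-1$

Solve the resulting linear system

$$\hat{\mathsf{Y}} = (A^{T} A)^{-1} A^{T} \mathsf{M}$$

Simple is beautiful!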
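The procedure can be sketched as follows for a toy problem. Everything specific here is an assumption made for illustration: the model is a hypothetical function of two independent $\mathcal{U}(-1,1)$ inputs, the basis is therefore made of normalized Legendre polynomials, and the design is a plain random sample of size 100.

```python
import numpy as np
from itertools import product
from numpy.polynomial import legendre


def model(x):
    """Hypothetical model with two independent U(-1, 1) inputs."""
    return np.exp(0.5 * x[:, 0]) * (1.0 + x[:, 1] ** 2)


def multi_indices(M, p):
    """Total-degree truncation: all alpha in N^M with |alpha|_1 <= p."""
    return [a for a in product(range(p + 1), repeat=M) if sum(a) <= p]


def psi_1d(k, u):
    """Legendre polynomial of degree k, normalized w.r.t. U(-1, 1)."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return legendre.legval(u, c) * np.sqrt(2 * k + 1)


def design_matrix(x, alphas):
    """Experimental matrix A_ij = Psi_j(x^(i))."""
    return np.column_stack([
        np.prod([psi_1d(k, x[:, i]) for i, k in enumerate(a)], axis=0)
        for a in alphas])


rng = np.random.default_rng(0)
alphas = multi_indices(M=2, p=5)           # P = 21 basis functions
x_ed = rng.uniform(-1.0, 1.0, (100, 2))    # experimental design
y_ed = model(x_ed)                         # model responses

A = design_matrix(x_ed, alphas)
# Least-squares solution; equivalent to (A^T A)^{-1} A^T M but numerically safer
y_hat, *_ = np.linalg.lstsq(A, y_ed, rcond=None)

# With an orthonormal basis, the first coefficient estimates the mean and
# the sum of squares of the others estimates the variance of the output
pce_mean = y_hat[0]
pce_var = np.sum(y_hat[1:] ** 2)
print(f"P = {len(alphas)}, mean = {pce_mean:.4f}, variance = {pce_var:.4f}")
```

`np.linalg.lstsq` is used instead of forming $(A^{T}A)^{-1}$ explicitly because the normal equations square the condition number of $A$; both give the same coefficients when $A$ is well conditioned.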

Validation: error estimators

In least-squares analysis, the generalization error is defined as:

$$E_{gen} = \mathbb{E}\left[\left(\mathcal{M}(X) - \mathcal{M}^{PC}(X)\right)^2\right], \qquad \mathcal{M}^{PC}(X) = \sum_{\alpha \in \mathcal{A}} y_\alpha \Psi_\alpha(X)$$

Leave-one-out cross validation

From statistical learning theory, model validation shall be carried out using independent data

LOO cross-validation for PCE emulates it using all data at once:

$$E_{LOO} = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{\mathcal{M}(x^{(i)}) - \mathcal{M}^{PC}(x^{(i)})}{1 - h_i}\right)^2$$

where $h_i$ is the $i$-th diagonal term of the matrix $A (A^{T} A)^{-1} A^{T}$, with $A_{ij} = \Psi_j(x^{(i)})$
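The closed-form leave-one-out error can be compared with a brute-force computation that refits the model $n$ times. The sketch below uses a hypothetical random regression problem in place of an actual PCE: the matrix `A` plays the role of the experimental matrix and `y` that of the model responses.

```python
import numpy as np

rng = np.random.default_rng(0)
n, P = 40, 6

# Hypothetical regression problem standing in for a PCE fit
A = rng.standard_normal((n, P))
y = A @ rng.standard_normal(P) + 0.1 * rng.standard_normal(n)

# Single fit on the full experimental design
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef

# h_i: diagonal of the projection matrix A (A^T A)^{-1} A^T
h = np.einsum("ij,ji->i", A, np.linalg.solve(A.T @ A, A.T))
e_loo = np.mean((residuals / (1.0 - h)) ** 2)

# Brute-force check: refit n times, each time leaving one point out
errs = []
for i in range(n):
    keep = np.arange(n) != i
    c_i, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
    errs.append((y[i] - A[i] @ c_i) ** 2)
e_loo_brute = np.mean(errs)

print(np.isclose(e_loo, e_loo_brute))
```

For ordinary least squares the two values agree up to round-off, which is why the leave-one-out error costs no more than a single fit.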


Example: sensitivity analysis in hydrogeology

Source: http://www.futura-sciences.com/

Source: http://lexpansion.lexpress.fr/

When assessing a nuclear waste repository, the Mean Lifetime Expectancy MLE(x) is the time required for a molecule of water at point x to get out of the boundaries of the system

Computational models have numerous input parameters (in each geological layer) that are difficult to measure, and that show scattering

Geological model

Joint work with University of Neuchâtel

Deman, Konakli, Sudret, Kerrou, Perrochet & Benabderrahmane, Reliab. Eng. Sys. Safety (2016)

Two-dimensional idealized model of the Paris Basin (25 km long / 1,040 m depth) with 5×5 m mesh ($10^6$ elements)

Steady-state flow simulation with Dirichlet boundary conditions:

$$\nabla \cdot (K \cdot \nabla H) = 0$$

15 homogeneous layers with uncertainties in:

- Porosity (resp. hydraulic conductivity)
- Anisotropy of the layer properties (inc. dispersivity)
- Boundary conditions (hydraulic gradients)

→ 78 input parameters

Sensitivity analysis

Figure: geometry of the layers (T, D1, D2, D3, D4, C1, C2, C3ab, L1a, L1b, L2a, L2b, L2c, K1, K2, K3) and conductivity of the layers ($K_x$ in m/s, log scale from $10^{-12}$ to $10^{-2}$)

Question

What are the parameters (out of 78) whose uncertainty drives the uncertainty of the prediction of the mean lifetime expectancy?

Sensitivity analysis: results

Technique: Sobol’ indices computed from polynomial chaos expansions

Figure: total Sobol’ indices $S_i^{Tot}$ (axis from 0.01 to 0.8) for $\phi_{D4}$, $\phi_{C3ab}$, $\phi_{L1b}$, $\phi_{L1a}$, $\phi_{C1}$, $\nabla H_2$, $\phi_{L2a}$, $\phi_{D1}$, $A_K^{D4}$, $A_\alpha^{C3ab}$

| Parameter | $\sum_j S_j$ |
|---|---|
| $\phi$ (resp. $K_x$) | 0.8664 |
| $A_K$ | 0.0088 |
| $\theta$ | 0.0029 |
| $\alpha_L$ | 0.0076 |
| $A_\alpha$ | 0.0000 |
| $\nabla H$ | 0.0057 |

Conclusions

Only 200 model runs allow us to detect the 10 important parameters out of 78

Uncertainty in the porosity/conductivity of 5 layers explains 86% of the variability

Small interactions between parameters detected
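Because the PCE basis is orthonormal, the output variance is the sum of the squared non-constant coefficients, and Sobol’ indices are obtained by grouping those squares according to which inputs each multi-index involves. The sketch below illustrates this post-processing on a hypothetical two-input expansion; the function name `sobol_from_pce` and the coefficient values are made up for the example.

```python
import numpy as np


def sobol_from_pce(alphas, coeffs):
    """First-order and total Sobol' indices from PCE multi-indices and coefficients."""
    alphas = np.asarray(alphas)
    coeffs = np.asarray(coeffs, dtype=float)

    # Total variance: sum of squared coefficients, constant term excluded
    nonconst = alphas.sum(axis=1) > 0
    variance = np.sum(coeffs[nonconst] ** 2)

    first, total = [], []
    for i in range(alphas.shape[1]):
        active = alphas[:, i] > 0                                   # terms involving X_i
        others_zero = np.delete(alphas, i, axis=1).sum(axis=1) == 0  # terms in X_i only
        first.append(np.sum(coeffs[active & others_zero] ** 2) / variance)
        total.append(np.sum(coeffs[active] ** 2) / variance)
    return np.array(first), np.array(total)


# Hypothetical expansion: constant, two main effects, one interaction term
alphas = [(0, 0), (1, 0), (0, 1), (1, 1)]
coeffs = [1.0, 2.0, 1.0, 1.0]
first, total = sobol_from_pce(alphas, coeffs)
print("first-order:", first)
print("total:      ", total)
```

In this example the variance is 6, so the first-order indices are 4/6 and 1/6 and the total indices are 5/6 and 2/6; the gap between the two sets measures the interaction between the inputs.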

Bonus: univariate effects

The univariate effects of each variable are obtained as a straightforward post-processing of the PCE

$$\mathcal{M}_i(x_i) \stackrel{\text{def}}{=} \mathbb{E}\left[\mathcal{M}(X) \mid X_i = x_i\right], \quad i = 1, \dots, M$$

Figure: univariate effects $\mathcal{M}_i^{PCE}$ (vertical axis from $-5 \times 10^4$ to $5 \times 10^4$) for $\phi_{D4}$, $\phi_{C3ab}$, $\phi_{L1b}$, $\phi_{L1a}$ and $\phi_{C1}$
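To see why this post-processing is straightforward, note that for independent inputs and an orthonormal basis with $\Psi_0^{(j)} = 1$, every univariate polynomial of degree at least one has zero mean. Taking the conditional expectation of the expansion therefore keeps only the terms that depend on $x_i$ alone. A sketch of the resulting expression, where the index set $\mathcal{A}_i$ is notation introduced here:

$$\mathcal{M}_i^{PCE}(x_i) = \sum_{\alpha \in \mathcal{A}_i} y_\alpha \, \Psi^{(i)}_{\alpha_i}(x_i), \qquad \mathcal{A}_i = \{\alpha \in \mathcal{A} : \alpha_j = 0 \;\; \forall j \ne i\}$$

The set $\mathcal{A}_i$ includes the constant term $\alpha = 0$, so each univariate effect is centred on the mean $y_0$ of the expansion.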


Combined cycle power plant (CCPP)

Data set: UC Irvine Machine Learning Repository

9,568 data points

4 features:

- Temperature $T \in [1.81, 37.11]$ °C
- Exhaust vacuum in the steam turbine $V \in [25.36, 81.56]$ cm Hg
- Ambient pressure $P \in [992.89, 1033.30]$ mbar
- Relative humidity in the gas turbine $RH \in [25.56, 100.16]$ %

Output: net hourly electrical energy output $EP \in [420.26, 495.76]$ MW

Reference approach: Tüfekci, P. (2014), Int. J. Elec. Power & Energy Systems

13 ML techniques including regression trees, ANN and SVR

10 pairs of training / validation sets of size 4,784

Best approach: bagging reduced error pruning (BREP) regression tree

CCPP: Training data (X-space)

Figure: training data in the space of the four input features

CCPP: Results

(Relative) mean absolute error:

$$MAE = \frac{1}{n_{val}} \sum_{(x, y) \in \mathcal{X}_{val}} \left| y - \mathcal{M}^{PC}(x) \right|$$

| | MAE | min. MAE | mean-min | rMAE (%) |
|---|---|---|---|---|
| aPCEonX | 3.11 ± 0.03 | 3.05 | 0.06 | 0.68 ± 0.007 |
| BREP-NN | 3.22 ± n.a. | 2.82 | 0.40 | n.a. |

Tüfekci et al. (2014)

Figure: estimated PDF $f_E(e)$ of the energy $e$ (MWh) produced by the CCPP: histogram of raw data; PDF obtained by PCE (10 different training sets) for input dependencies modelled by C-vines

Airfoil

Data set: UC Irvine Machine Learning Repository

750 training points, 750 validation points

41 features:

- Frequency, in Hertz
- Angle of attack, in degrees
- Chord length, in meters
- Free-stream velocity, in meters per second
- Suction side displacement thickness, in meters
- 36 noise variables (standard normal)

Output: Scaled sound pressure level, in decibels

Reference approach: K. Kandasamy & Y. Yu, ICML16 Proc. of the 33rd Int. Conf. on Machine Learning (2016)

Sparse LASSO regression (SALSA)

Beats 13 other regression models, incl. neural networks

Airfoil: Results

(Relative) mean absolute error (MAE)

| | MAE (dB) | rMAE (%) |
|---|---|---|
| aPCEonX | 3.04 ± 0.07 | 2.4 ± 0.06 |
| SALSA | 3.81 ± 0.06 | 3.1 ± 0.04 |

Kandasamy & Yu (2016)

Conclusions

Surrogate models are unavoidable when dealing with costly computational models for uncertainty quantification, sensitivity analysis or optimization

Depending on the analysis, specific surrogates are most suitable, e.g. polynomial chaos expansions for distribution- and sensitivity analysis, Kriging for reliability analysis

All these techniques are non-intrusive: they rely on experimental designs, the size of which is a user’s choice

They are versatile, general-purpose and field-independent

All the presented algorithms are available in the general-purpose uncertainty quantification software UQLab


www.uqlab.com


UQLab features

UQLab: The Uncertainty Quantification Software http://www.uqlab.com

ETH license:

- free access to academia
- yearly fee for non-academic usage

2,900+ registered users

1,280 active users from 87 countries

About 37% license renewal after one year

| Country | # Users |
|---|---|
| United States | 493 |
| China | 365 |
| France | 301 |
| Switzerland | 238 |
| Germany | 221 |
| United Kingdom | 134 |
| Italy | 110 |
| Brazil | 96 |
| India | 88 |
| Canada | 77 |

As of August 24, 2020


UQWorld: the community of UQ https://uqworld.org/


Questions ?

Chair of Risk, Safety & Uncertainty Quantification www.rsuq.ethz.ch

The Uncertainty Quantification Software

www.uqlab.com

Thank you very much for your attention !
