Linear(-ized) Inverse Problems

(1)

Linear inverse problems

- Formulation

- Some linear algebra

- Matrix calculation (revision)

- Illustration: underdetermined, overdetermined and unique case

- Examples

Linearized inverse problems

- Formulation

- Examples

- Partial derivatives

Scope: Formulate linear inverse problems as a system of equations in matrix form. Find the conditions under which solutions exist. Understand how to linearize a non-linear system to be able to find solutions.

(2)

Literature

Stein and Wysession: Introduction to seismology, Chapter 7

Aki and Richards: Quantitative Seismology (1st edition), Chapter 12.3

Shearer: Introduction to seismology, Chapter 5

Menke, Discrete Inverse Problems

http://www.ldeo.columbia.edu/users/menke/gdadit/index.htm

Full PPT files and MATLAB routines

(3)

Formulation

Linear(-ized) inverse problems can be formulated in the following way:

$$d_i = G_{ij}\, m_j$$

(summation convention applies)

i = 1, 2, ..., N: number of data

j = 1, 2, ..., M: number of model parameters

$G_{ij}$ known (N × M matrix)

We observe:

- The inverse problem has a unique solution if N=M and det(G)≠0, i.e. the data are linearly independent.

- The problem is overdetermined if N>M.

- The problem is underdetermined if M>N.

(4)

Illustration – Unique Case

In this case N=M, and det(G) ≠0. Let us consider an example

$$d_1 = 3 m_1 + 2 m_2 = 1$$

$$d_2 = m_1 + 4 m_2 = 2$$

Let us check the determinant of this system: det(G)=10

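$$\mathbf{d} = \mathbf{G}\mathbf{m}: \qquad \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \end{pmatrix}$$

$$\mathbf{G}^{-1}\mathbf{d} = \mathbf{G}^{-1}\mathbf{G}\mathbf{m} = \mathbf{m}$$

$$\begin{pmatrix} m_1 \\ m_2 \end{pmatrix} = \begin{pmatrix} 0.4 & -0.2 \\ -0.1 & 0.3 \end{pmatrix} \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} 0.4 & -0.2 \\ -0.1 & 0.3 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0.5 \end{pmatrix}$$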
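As a quick numerical check of the unique case, the following minimal Python sketch inverts the 2×2 example by hand. The helper name `solve_2x2` is hypothetical and introduced only for this illustration; the matrix and data values are those of the example (det(G) = 10, d = (1, 2)).

```python
def solve_2x2(G, d):
    """Solve G m = d for a 2x2 matrix G using the explicit inverse."""
    (a, b), (c, e) = G
    det = a * e - b * c
    if det == 0:
        raise ValueError("det(G) = 0: the system has no unique solution")
    # Inverse of a 2x2 matrix: swap the diagonal, negate the off-diagonal,
    # and divide everything by the determinant.
    G_inv = [[e / det, -b / det], [-c / det, a / det]]
    m = [G_inv[0][0] * d[0] + G_inv[0][1] * d[1],
         G_inv[1][0] * d[0] + G_inv[1][1] * d[1]]
    return det, G_inv, m


G = [[3.0, 2.0], [1.0, 4.0]]   # example matrix
d = [1.0, 2.0]                 # example data
det, G_inv, m = solve_2x2(G, d)
print("det(G) =", det)
print("G^-1   =", G_inv)
print("m      =", m)
```

For larger systems one would normally not form the inverse explicitly but solve the system with an elimination or factorization routine; the explicit inverse is used here only because it mirrors the slide.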

(5)

Illustration – Overdetermined Case

In this case N>M, there are more data than model parameters.

Let us consider an example with M=2. An overdetermined system arises, for instance, if N=3:

$$d_1 = m_1 = 1$$

$$d_2 = m_2 = 2$$

$$d_3 = m_1 + m_2 = 2$$

A physical experiment which could result in these data:

Weighing two masses $m_1$ and $m_2$ individually leads to the data $d_1$ and $d_2$, and weighing both together leads to $d_3$. In matrix form:

$$\mathbf{d} = \mathbf{G}\mathbf{m}: \qquad \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \end{pmatrix}$$

(6)

Let us consider this problem graphically

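A common way to solve this problem is to minimize the difference between the data vector $\mathbf{d}$ and the predicted data $\mathbf{G}\mathbf{m}$ for some model $\mathbf{m}$, such that

$$S = \|\mathbf{d} - \mathbf{G}\mathbf{m}\|^2$$

is minimal.

[Figure: the three lines $m_1 = 1$, $m_2 = 2$ and $m_1 + m_2 = 2$ plotted in the ($m_1$, $m_2$) plane.]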

(7)

Using the L₂-norm leads us to the least-squares formulation of the problem. The solution to the minimization (and thus the inverse problem) is given as:

$$\tilde{\mathbf{m}} = (\mathbf{G}^T\mathbf{G})^{-1}\mathbf{G}^T\mathbf{d}$$

In our example the resulting (best) model estimate is:

$$\tilde{\mathbf{m}} = \begin{pmatrix} 2/3 \\ 5/3 \end{pmatrix}$$

It is the model that minimizes the summed squared misfit to the three equations, i.e. the best compromise between the three lines in the plot (marked "best model").
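The least-squares formula can be checked numerically for the weighing example. The following is a minimal pure-Python sketch; the helper names `transpose`, `matmul` and `inverse_2x2` are hypothetical and written out here only to keep the example self-contained.

```python
def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def inverse_2x2(A):
    """Explicit inverse of a 2x2 matrix."""
    (a, b), (c, e) = A
    det = a * e - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[e / det, -b / det], [-c / det, a / det]]


G = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # two single weighings, one combined
d = [[1.0], [2.0], [2.0]]                  # data as a column vector

Gt = transpose(G)
m_est = matmul(inverse_2x2(matmul(Gt, G)), matmul(Gt, d))   # (G^T G)^-1 G^T d

# Misfit S = ||d - G m||^2 of the least-squares model
residual = [di[0] - pi[0] for di, pi in zip(d, matmul(G, m_est))]
S = sum(r * r for r in residual)
print("m_est =", m_est)
print("S     =", S)
```

The estimate does not satisfy any of the three equations exactly; each residual has magnitude 1/3, which is what "best compromise" means in the least-squares sense.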

(8)

Illustration – Underdetermined Case

Let us assume we made one measurement of the combined weight of two masses:

$$m_1 + m_2 = d = 2$$

Clearly there are infinitely many solutions to this problem. A model estimate can be defined by choosing the model that fits the data exactly, $\mathbf{G}\mathbf{m} = \mathbf{d}$, and has the smallest L₂-norm $\|\mathbf{m}\|$. Using Lagrange multipliers one can show that the minimum-norm solution is given by

$$\tilde{\mathbf{m}} = \mathbf{G}^T(\mathbf{G}\mathbf{G}^T)^{-1}\mathbf{d} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
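The minimum-norm formula is easy to evaluate for this one-datum example, because $\mathbf{G}\mathbf{G}^T$ is then a scalar. The following minimal Python sketch also compares the result with another model, `m_other`, chosen arbitrarily here as an example of a model that fits the datum equally well.

```python
import math

G = [1.0, 1.0]   # single row of G: one measurement of m1 + m2
d = 2.0          # measured combined weight

# For a single datum G G^T is a scalar, so its inverse is a simple division.
GGt = sum(g * g for g in G)
m_est = [g * d / GGt for g in G]          # G^T (G G^T)^-1 d

# Another model that also fits the datum exactly, for comparison.
m_other = [2.0, 0.0]


def norm(m):
    """Euclidean (L2) norm of a model vector."""
    return math.sqrt(sum(x * x for x in m))


print("minimum-norm model:", m_est, "norm:", norm(m_est))
print("other exact model: ", m_other, "norm:", norm(m_other))
```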

(9)

Examples – Inversion of Gravity Data

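Let's go back to the problem of gravity. In 2-D the Bouguer anomaly at point $x_0$ with arbitrary topography is given by (e.g. Telford et al., 1990)

$$d(x_0) = 2\gamma \iint \frac{z\,\rho(x,z)}{(x - x_0)^2 + z^2}\,dx\,dz$$

where $\gamma$ is the gravitational constant and $\rho(x,z)$ the density distribution.

[Figure: the anomaly $d(x_0)$ is observed at position $x_0$ above a density distribution $\rho(x,z)$; the subsurface is divided into square cells $j = 1, 2, ..., 20$ of side $h$, with cell $j$ located at $(x_j, z_j)$ and carrying the model parameter $m_j$.]

To bring this into the form d=Gm we discretize the space:

$$d_i = \sum_{j=1}^{M} \underbrace{\frac{2\gamma\, h^2\, z_j}{(x_j - x_i)^2 + z_j^2}}_{G_{ij}}\; m_j$$

where $x_i$ is the position of observation $i$ and $m_j$ is the density of cell $j$.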
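The discretized gravity kernel can be assembled directly from the cell coordinates. The sketch below is a minimal illustration, not the lecture's own routine: the function name `gravity_kernel`, the cell size, the 2 × 5 grid layout, the observation points and the density value are all assumptions chosen for the example. Each cell is treated as if its mass were concentrated at its centre, as in the sum above.

```python
GAMMA = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def gravity_kernel(x_obs, cells, h):
    """Return G with G[i][j] = 2*GAMMA*h^2*z_j / ((x_j - x_i)^2 + z_j^2)."""
    return [[2.0 * GAMMA * h ** 2 * zj / ((xj - xi) ** 2 + zj ** 2)
             for (xj, zj) in cells]
            for xi in x_obs]


h = 100.0                                   # cell size in m (assumed)
# 2 rows x 5 columns of cells, numbered row by row; cell centres (x_j, z_j)
cells = [(h * (col + 0.5), h * (row + 0.5))
         for row in range(2) for col in range(5)]
x_obs = [50.0, 250.0, 450.0]                # observation points x_i at z = 0

G = gravity_kernel(x_obs, cells, h)

# Forward problem d = G m for a model with a single anomalous cell (index 2)
m = [0.0] * len(cells)
m[2] = 500.0                                # density in kg/m^3 (assumed)
d = [sum(Gij * mj for Gij, mj in zip(row, m)) for row in G]
print(d)                                    # anomaly in m/s^2 at each x_i
```

The anomaly is largest at the observation point directly above the anomalous cell and symmetric on either side of it, as expected from the kernel.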


(10)

Master-Event-Method

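Let us assume we have previously located an earthquake at $(x_0, y_0, z_0)$ at time $t_0$ (Event 1, the master event), and we recorded a new event (Event 2) at stations 1, ..., N.

[Figure: stations 1, 2, 3 in an (x, y, z) coordinate system recording Event 1 and Event 2, with ray path $L_i$, unit vector $\hat{\mathbf{u}}_i$, angle $\gamma_i$ and arrival-time difference $\Delta t_i$ at station $i$.]

With $\hat{\mathbf{u}}_i$ taken as the unit vector pointing from Event 1 towards station $i$, and $\gamma_i$ the angle between $\hat{\mathbf{u}}_i$ and the vector $\Delta\mathbf{L} = (\Delta x, \Delta y, \Delta z)$ from Event 1 to Event 2, the path length to station $i$ changes by

$$\Delta L_i = -\hat{\mathbf{u}}_i \cdot \Delta\mathbf{L} = -|\Delta\mathbf{L}| \cos\gamma_i$$

and the difference in arrival times is

$$\Delta t_i = \Delta\tau + \frac{\Delta L_i}{\alpha} = \Delta\tau - \frac{1}{\alpha}\left(\hat{u}_{ix}\,\Delta x + \hat{u}_{iy}\,\Delta y + \hat{u}_{iz}\,\Delta z\right)$$

where $\Delta\tau$ is the difference in origin time and $\alpha$ the P-wave velocity.

This is a system of linear equations for the 4 unknowns $\Delta\tau$, $\Delta x$, $\Delta y$, $\Delta z$.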

(11)

Master-Event-Method

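Let us put this system into the common form d=Gm:

$$d_i = \Delta t_i, \qquad m_1 = \Delta\tau, \quad m_2 = \Delta x, \quad m_3 = \Delta y, \quad m_4 = \Delta z$$

$$G_{i1} = 1, \qquad G_{i2} = -\frac{\hat{u}_{ix}}{\alpha}, \quad G_{i3} = -\frac{\hat{u}_{iy}}{\alpha}, \quad G_{i4} = -\frac{\hat{u}_{iz}}{\alpha}$$

$$\begin{pmatrix} \Delta t_1 \\ \vdots \\ \Delta t_i \\ \vdots \\ \Delta t_N \end{pmatrix} = \begin{pmatrix} 1 & -\hat{u}_{1x}/\alpha & -\hat{u}_{1y}/\alpha & -\hat{u}_{1z}/\alpha \\ \vdots & \vdots & \vdots & \vdots \\ 1 & -\hat{u}_{Nx}/\alpha & -\hat{u}_{Ny}/\alpha & -\hat{u}_{Nz}/\alpha \end{pmatrix} \begin{pmatrix} \Delta\tau \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}$$

Here $\hat{\mathbf{u}}_i$ is taken as the unit vector pointing from the master event towards station $i$, and $\alpha$ is the P-wave velocity.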

(12)

Vertical Seismic Profile

Let us consider a string of receivers in a borehole.

[Figure: borehole with a string of seismometers crossing layers with velocities $v_1$, $v_2$, ..., $v_M$.]

- We assume straight rays.

- The ground is discretized with M layers of equal thickness dz with velocities $v_i$.

- The N seismometers are located at depths $z_j$.

Formulate the forward problem in matrix form d=Gm! Is the problem linear? What would happen if the rays are not modelled as straight lines?
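One possible way to set up this forward problem is sketched below in Python. It relies on additional assumptions that are not stated on the slide: the source sits at the surface right at the borehole, so the straight rays are vertical, and the model parameters are taken to be the layer slownesses $1/v_i$, which makes the travel times linear in the model. The function name `vsp_kernel` and all numerical values are hypothetical.

```python
def vsp_kernel(receiver_depths, n_layers, dz):
    """G[j][i] = length of a vertical ray to receiver j inside layer i."""
    G = []
    for z_r in receiver_depths:
        row = []
        for i in range(n_layers):
            top, bottom = i * dz, (i + 1) * dz
            # Part of the interval [0, z_r] that lies inside layer i
            row.append(max(0.0, min(z_r, bottom) - top))
        G.append(row)
    return G


dz = 100.0                                  # layer thickness in m (assumed)
velocities = [2000.0, 3000.0, 4000.0]       # v_i in m/s (assumed)
slowness = [1.0 / v for v in velocities]    # model parameters m_i = 1 / v_i
receivers = [50.0, 150.0, 250.0]            # receiver depths z_j in m (assumed)

G = vsp_kernel(receivers, len(velocities), dz)
travel_times = [sum(g * s for g, s in zip(row, slowness)) for row in G]
print(G)
print(travel_times)                         # travel times in s
```

With velocities instead of slownesses as model parameters, or with ray paths that depend on the velocity model, the entries of G would themselves depend on the model and the problem would no longer be linear.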

(13)

Linearized Inversion

Let us formalize the situation where we are able to linearize an otherwise nonlinear problem around some model m₀. In this case the forward problem is given by

$$d_i = F_i(m_1, m_2, \ldots, m_M), \qquad i = 1, \ldots, N$$

This function of M variables is expanded around some model m₀=(m₀₁, m₀₂, ..., m₀M), where we neglect higher-order terms:

$$d_i \approx \underbrace{F_i(m_{01}, m_{02}, \ldots, m_{0M})}_{d_i^0} + \sum_{j=1}^{M} \underbrace{\left.\frac{\partial F_i}{\partial m_j}\right|_{\mathbf{m}_0}}_{G_{ij}}\; \underbrace{(m_j - m_{0j})}_{\Delta m_j}$$

so that

$$\Delta d_i = G_{ij}\,\Delta m_j$$

$d_i^0$: synthetic data of the starting model (known)

$\Delta d_i = d_i - d_i^0$: data difference vector (residuals, misfit, cost ...)

$\Delta m_j = m_j - m_{0j}$: model difference vector (model perturbation)

(14)

Linearized Inversion: Hypocenter location

Above a homogeneous half space we measure P-wave arrival times from an earthquake that happens at time t at (x, y, z), at receiver locations $(x_i, y_i, z_i)$, $i = 1, \ldots, N$. So our model vector is m=(t,x,y,z)^T. The arrival times are given by

$$t_i = F_i(\mathbf{m}) = t + \frac{1}{\alpha}\left[(x_i - x)^2 + (y_i - y)^2 + (z_i - z)^2\right]^{1/2}$$

where $\alpha$ is the P-wave velocity of the half space. This is a nonlinear problem! Now let us assume we have a rough idea about the time of the earthquake and its location. This is our starting model m₀= (t₀, x₀, y₀, z₀)^T.

To linearize the problem we now have to find the partial derivatives of F with respect to all model parameters at m₀.

(15)

Hypocenter location – partial derivatives

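... from

$$t_i = F_i(\mathbf{m}) = t + \frac{1}{\alpha}\left[(x_i - x)^2 + (y_i - y)^2 + (z_i - z)^2\right]^{1/2}$$

(with $\alpha$ the P-wave velocity) we obtain:

$$\begin{aligned}
G_{i1} &= \left.\frac{\partial F_i}{\partial t}\right|_{\mathbf{m}_0} = 1 \\
G_{i2} &= \left.\frac{\partial F_i}{\partial x}\right|_{\mathbf{m}_0} = -\frac{1}{\alpha}\,\frac{x_i - x_0}{R_{i0}} \\
G_{i3} &= \left.\frac{\partial F_i}{\partial y}\right|_{\mathbf{m}_0} = -\frac{1}{\alpha}\,\frac{y_i - y_0}{R_{i0}} \\
G_{i4} &= \left.\frac{\partial F_i}{\partial z}\right|_{\mathbf{m}_0} = -\frac{1}{\alpha}\,\frac{z_i - z_0}{R_{i0}}
\end{aligned}$$

with

$$R_{i0} = \left[(x_i - x_0)^2 + (y_i - y_0)^2 + (z_i - z_0)^2\right]^{1/2}$$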

(16)

Hypocenter location – partial derivatives

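... let us now define a vector

$$\hat{\mathbf{u}}_i = \frac{1}{R_{i0}}\,(x_i - x_0,\; y_i - y_0,\; z_i - z_0)$$

which is a unit vector pointing from the initial source location to receiver i (for receivers at the surface, $z_i = 0$, the last component is $-z_0/R_{i0}$). We obtain:

$$\underbrace{t_i - t_i^0}_{\Delta d_i} = \underbrace{(t - t_0)}_{\Delta m_1} - \frac{1}{\alpha}\Big[\hat{u}_{ix}\underbrace{(x - x_0)}_{\Delta m_2} + \hat{u}_{iy}\underbrace{(y - y_0)}_{\Delta m_3} + \hat{u}_{iz}\underbrace{(z - z_0)}_{\Delta m_4}\Big]$$

where $t_i^0 = F_i(\mathbf{m}_0)$ is the arrival time predicted by the starting model and $\alpha$ is the P-wave velocity.

This is exactly the form we obtained for the Master-Event Method. What is the difference, however?

This approach is an iterative algorithm: the solution for the model perturbation is used to update the starting model, and the linearization is repeated until the model no longer changes.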
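The iteration can be sketched in a few lines of Python. This is a minimal, undamped Gauss-Newton loop (often referred to as Geiger's method in earthquake location), not a production locator: the velocity `ALPHA`, the station coordinates, the synthetic "true" hypocenter and the function names are all assumptions made for this illustration, and the data are noise-free.

```python
import math

ALPHA = 6.0  # assumed P-wave velocity of the half space in km/s


def arrival_times(m, stations):
    """Forward problem t_i = t + R_i / ALPHA for m = (t, x, y, z)."""
    t, x, y, z = m
    return [t + math.dist((x, y, z), s) / ALPHA for s in stations]


def partial_derivatives(m, stations):
    """Rows (1, -u_ix/ALPHA, -u_iy/ALPHA, -u_iz/ALPHA) evaluated at m."""
    _, x, y, z = m
    G = []
    for (xi, yi, zi) in stations:
        R = math.dist((x, y, z), (xi, yi, zi))
        G.append([1.0, -(xi - x) / (ALPHA * R),
                  -(yi - y) / (ALPHA * R), -(zi - z) / (ALPHA * R)])
    return G


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x


def locate(t_obs, stations, m0, n_iter=10):
    """Iterate: linearize around m, solve for dm by least squares, update m."""
    m = list(m0)
    for _ in range(n_iter):
        G = partial_derivatives(m, stations)
        dd = [o - p for o, p in zip(t_obs, arrival_times(m, stations))]
        GtG = [[sum(row[a] * row[b] for row in G) for b in range(4)]
               for a in range(4)]
        Gtd = [sum(row[a] * r for row, r in zip(G, dd)) for a in range(4)]
        dm = solve(GtG, Gtd)                 # least-squares model update
        m = [mj + dmj for mj, dmj in zip(m, dm)]
    return m


stations = [(-20.0, -15.0, 0.0), (25.0, -10.0, 0.0), (18.0, 22.0, 0.0),
            (-12.0, 20.0, 0.0), (3.0, -28.0, 0.0), (-5.0, 4.0, 0.0)]
m_true = [1.0, 2.0, -1.0, 8.0]               # (t, x, y, z), synthetic "truth"
t_obs = arrival_times(m_true, stations)      # noise-free synthetic data
m_est = locate(t_obs, stations, m0=[0.0, 0.0, 0.0, 10.0])
print(m_est)
```

A fixed number of iterations is used here for brevity; in practice one would stop when the update or the residual falls below a threshold, and add damping when the starting model is poor.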

(17)

Linearized Travel-Time Inversion

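We learned in seismology that for a given ray parameter p the delay time τ(p) is given by the difference between the travel time T(p) and p times the distance X(p) at which the ray emerges:

$$\tau(p) = T(p) - pX(p) = 2\int_0^{z_s(p)} \left(\frac{1}{c^2(z)} - p^2\right)^{1/2} dz$$

where $z_s(p)$ is the depth at which the ray with ray parameter p turns.

Graphically this can be interpreted as:

[Figure: travel-time curve T(X); the tangent with slope p=dT/dX at distance X intersects the T axis at τ(p) = T - pX.]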

(18)

Linearized Travel-Time Inversion

… the important property of τ(p) is the fact that it decreases monotonically with increasing p, so it is a function easier to handle than the travel times (which may contain triplications).

τ(p) is nonlinearly related to the velocity model c(z):

$$\tau(p) = 2\int_0^{z_s(p)} \left(\frac{1}{c^2(z)} - p^2\right)^{1/2} dz$$

So in order to invert for c(z) we have to linearize. For a small velocity perturbation $\delta c(z)$ we obtain

$$\delta\tau(p) = -2\int_0^{z_s(p)} \frac{\delta c(z)}{c^3(z)\left(\dfrac{1}{c^2(z)} - p^2\right)^{1/2}}\, dz$$

Now the perturbation in τ(p) (the data residual) is linearly related to the perturbation in the velocity model c(z). This integral can easily be brought into the form d=Gm by subdividing the Earth into layers (e.g. of equal thickness).
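The subdivision into layers can be sketched as follows. In this minimal Python illustration the Earth is replaced by a stack of homogeneous layers of equal thickness, so the integral becomes a sum over the layers above the turning point, and the kernel entry for layer k is the derivative of that sum with respect to the layer velocity. The function names, the velocity model, the ray parameters and the perturbation are assumptions chosen for the example, and any change of the turning depth is neglected.

```python
import math


def tau(p, c, h):
    """tau(p) = 2 * sum_k h * sqrt(1/c_k^2 - p^2) over layers above the turning point."""
    total = 0.0
    for ck in c:
        q2 = 1.0 / ck ** 2 - p ** 2
        if q2 <= 0.0:          # the ray turns at the top of this layer
            break
        total += 2.0 * h * math.sqrt(q2)
    return total


def tau_kernel(p_values, c, h):
    """G[i][k] = d tau(p_i) / d c_k = -2 h / (c_k^3 sqrt(1/c_k^2 - p_i^2))."""
    G = []
    for p in p_values:
        row = [0.0] * len(c)
        for k, ck in enumerate(c):
            q2 = 1.0 / ck ** 2 - p ** 2
            if q2 <= 0.0:      # layers below the turning point do not contribute
                break
            row[k] = -2.0 * h / (ck ** 3 * math.sqrt(q2))
        G.append(row)
    return G


h = 5.0                              # layer thickness in km (assumed)
c0 = [3.0, 4.0, 5.0, 6.0]            # starting velocity model in km/s (assumed)
p_values = [0.18, 0.22, 0.30]        # ray parameters in s/km (assumed)
dc = [0.001, -0.002, 0.0015, 0.0]    # small velocity perturbation in km/s

G = tau_kernel(p_values, c0, h)
d_tau_linear = [sum(g * x for g, x in zip(row, dc)) for row in G]

# Compare with the exact change of tau(p) for the perturbed model
c1 = [a + b for a, b in zip(c0, dc)]
d_tau_exact = [tau(p, c1, h) - tau(p, c0, h) for p in p_values]
print(d_tau_linear)
print(d_tau_exact)
```

For perturbations this small the linear prediction and the exact change agree closely; the agreement degrades for larger perturbations or for rays that turn very close to a layer boundary, where the kernel becomes large.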



(19)

Partial Derivatives

Let us take a closer look at the matrix $G_{ij}$ for linearized problems. What useful information is contained in this matrix (operator)? When $\mathbf{d} = \mathbf{g}(\mathbf{m})$, then the linearization leads to

$$\Delta\mathbf{d} = \mathbf{G}\,\Delta\mathbf{m}$$

and the matrix $G_{ij}$ contains the partial derivatives

$$G_{ij} = \frac{\partial g_i}{\partial m_j}$$

The actual (relative) values of $G_{ij}$ determine how the model parameters influence the data (or data difference).

Example: $G_{ik}$ are small for all i. This implies that the model parameter $m_k$ has almost no influence on the data. It can be varied without changing them. Therefore, its resolution is poor.

(20)

Resolution – Hypocenter Location

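Example: Earthquake hypocenter location

$$\underbrace{t_i - t_i^0}_{\Delta d_i} = \underbrace{(t - t_0)}_{\Delta m_1} - \frac{1}{\alpha}\Big[\hat{u}_{ix}\underbrace{(x - x_0)}_{\Delta m_2} + \hat{u}_{iy}\underbrace{(y - y_0)}_{\Delta m_3} + \hat{u}_{iz}\underbrace{(z - z_0)}_{\Delta m_4}\Big]$$

Remember that the elements of $G_{ij}$ were, up to the factor $-1/\alpha$ (with $\alpha$ the P-wave velocity), the components of the unit vector $\hat{\mathbf{u}}_i$ which points from the original (known) hypocenter to the receiver. Small $\hat{u}_{iz}$ with respect to the other components means bad resolution in depth:

The depth resolution of shallow earthquakes far away is poor.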
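A short numerical illustration of this statement, as a minimal Python sketch: for a shallow source, the vertical component of the unit vector towards a distant surface station is much smaller than towards a nearby one. The source depth and the station distances are arbitrary values assumed for the example.

```python
import math


def unit_vector(source, receiver):
    """Unit vector pointing from the source to the receiver."""
    diff = [r - s for r, s in zip(receiver, source)]
    R = math.sqrt(sum(c * c for c in diff))
    return [c / R for c in diff]


source = (0.0, 0.0, 2.0)                        # shallow event, 2 km deep
near = unit_vector(source, (5.0, 0.0, 0.0))     # surface station 5 km away
far = unit_vector(source, (200.0, 0.0, 0.0))    # surface station 200 km away

print("near station: u_x = %.3f, u_z = %.3f" % (near[0], near[2]))
print("far station:  u_x = %.3f, u_z = %.3f" % (far[0], far[2]))
```

If all stations are far away, the whole depth column of the matrix is close to zero, so a change in depth hardly changes the predicted arrival times.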

(21)

Linear Dependence

When two columns of $G_{ij}$ are linearly dependent, then for all i

$$G_{ik} = c\,G_{il}$$

What are the consequences for a model perturbation in parameters k and l?

Linear dependence implies (no summation over k and l)

$$G_{ik}\,\Delta m_k + G_{il}\,\Delta m_l = G_{il}\,(c\,\Delta m_k + \Delta m_l) \overset{!}{=} 0 \quad\Rightarrow\quad \Delta m_l = -c\,\Delta m_k$$

In words: Parameters $m_k$ and $m_l$ cannot be independently determined as they compensate each other. This is called a trade-off.
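The compensation can be seen in a small numerical example. In this minimal Python sketch the matrix and the factor c are made up for the illustration: the first column is c times the second, a perturbation pair with $\Delta m_l = -c\,\Delta m_k$ leaves all data unchanged, and the matrix $\mathbf{G}^T\mathbf{G}$ needed for a least-squares solution is singular.

```python
c = 2.0
# Column k (first) equals c times column l (second)
G = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]

dm_k = 0.5
dm_l = -c * dm_k                            # compensating perturbation
dd = [row[0] * dm_k + row[1] * dm_l for row in G]

# Normal-equation matrix G^T G and its determinant
GtG = [[sum(row[a] * row[b] for row in G) for b in range(2)]
       for a in range(2)]
det = GtG[0][0] * GtG[1][1] - GtG[0][1] * GtG[1][0]

print("data change:", dd)
print("det(G^T G) =", det)
```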

(22)

Trade-Off Fault Zone Waves

(23)

Calculating Partial Derivatives (1)

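Generally we need to calculate the partial derivatives

$$G_{ij} = \left.\frac{\partial F_i}{\partial m_j}\right|_{(m_{01},\, m_{02},\, \ldots,\, m_{0M})}$$

… depending on the formulation of the forward problem …

1. For explicit functions $F_i$, for example:

Gravity problem ($\gamma$: gravitational constant, $h$: cell size):

$$d_i = \sum_{j=1}^{M} \underbrace{\frac{2\gamma\, h^2\, z_j}{(x_j - x_i)^2 + z_j^2}}_{G_{ij}}\; m_j$$

Master-Event Method ($\alpha$: P-wave velocity, $\hat{\mathbf{u}}_i$: unit vector from the master event towards station i):

$$\Delta t_i = \Delta\tau - \frac{1}{\alpha}\left(\hat{u}_{ix}\,\Delta x + \hat{u}_{iy}\,\Delta y + \hat{u}_{iz}\,\Delta z\right)$$

… we can directly calculate the partial derivatives.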

(24)

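2. The data $d_i$ are given implicitly through

$$f_i(d_i, m_1, m_2, \ldots, m_M) = 0$$

We differentiate with respect to $m_j$:

$$\frac{\partial f_i}{\partial m_j} + \frac{\partial f_i}{\partial d_i}\,\frac{\partial d_i}{\partial m_j} = 0 \quad\Rightarrow\quad G_{ij} = \frac{\partial d_i}{\partial m_j} = -\,\frac{\partial f_i / \partial m_j}{\partial f_i / \partial d_i}$$

The arguments are the model and data parameters of the starting model m₀. Often the data d₀ are obtained by finding the roots of the first equation.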
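A one-parameter toy example may make the implicit case more concrete. In the Python sketch below the relation $f(d, m) = d^3 + d - m = 0$ is an arbitrary choice made for illustration, as are the function names: the datum of the starting model is found as a root with Newton's method, the partial derivative follows from the formula above, and a finite difference of the root serves as a cross-check.

```python
def f(d, m):
    """Implicit relation f(d, m) = d^3 + d - m = 0 (illustrative choice)."""
    return d ** 3 + d - m


def df_dd(d, m):
    """Partial derivative of f with respect to the datum d."""
    return 3.0 * d ** 2 + 1.0


def df_dm(d, m):
    """Partial derivative of f with respect to the model parameter m."""
    return -1.0


def solve_for_d(m, d_start=0.0, n_iter=50):
    """Find the root d of f(d, m) = 0 with Newton's method."""
    d = d_start
    for _ in range(n_iter):
        d -= f(d, m) / df_dd(d, m)
    return d


m0 = 2.0
d0 = solve_for_d(m0)                       # root: d0 = 1, since 1 + 1 - 2 = 0
G = -df_dm(d0, m0) / df_dd(d0, m0)         # G = -(df/dm) / (df/dd)

# Cross-check with a finite difference of the root itself
eps = 1e-6
G_fd = (solve_for_d(m0 + eps) - d0) / eps
print(d0, G, G_fd)
```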


(25)

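3. In more complicated cases the partial derivatives have to be obtained by numerical differentiation:

$$G_{ij} = \frac{\partial d_i}{\partial m_j} \approx \frac{d_i(\ldots, m_{0j} + \Delta m_j, \ldots) - d_i^0}{\Delta m_j}$$

where $d_i^0$ are the data of the starting model m₀.

Note that the evaluation of the elements of $G_{ij}$ requires solutions of the forward problem: at least one additional forward calculation for every model parameter $m_j$! In cases where the number of model parameters is large or where the forward problem is very involved this is impractical. But at least this method always works (approximately).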
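The finite-difference formula translates almost literally into code. The following minimal Python sketch uses the arrival-time function of the hypocenter example as forward problem so that the result can be compared with the analytic partial derivatives; the velocity `ALPHA`, the stations, the starting model, the step-size rule and the function names are assumptions made for this illustration.

```python
import math

ALPHA = 6.0  # assumed P-wave velocity in km/s


def forward(m, stations):
    """Arrival times t_i = t + R_i / ALPHA for m = (t, x, y, z)."""
    t, x, y, z = m
    return [t + math.dist((x, y, z), s) / ALPHA for s in stations]


def numerical_partials(forward, m0, stations, rel_step=1e-6):
    """Forward-difference approximation of G_ij = d d_i / d m_j at m0."""
    d0 = forward(m0, stations)               # one forward run for m0
    G = [[0.0] * len(m0) for _ in d0]
    for j, mj in enumerate(m0):
        dm = rel_step * max(abs(mj), 1.0)    # step size for parameter j
        m_pert = list(m0)
        m_pert[j] = mj + dm
        d_pert = forward(m_pert, stations)   # one extra forward run per j
        for i in range(len(d0)):
            G[i][j] = (d_pert[i] - d0[i]) / dm
    return G


stations = [(10.0, 0.0, 0.0), (0.0, 15.0, 0.0), (-8.0, -6.0, 0.0)]
m0 = [0.0, 1.0, 2.0, 5.0]
G_num = numerical_partials(forward, m0, stations)

# Analytic partial derivatives for comparison
G_ana = []
for s in stations:
    R = math.dist(tuple(m0[1:]), s)
    G_ana.append([1.0] + [-(si - mi) / (ALPHA * R)
                          for si, mi in zip(s, m0[1:])])
print(G_num)
print(G_ana)
```

The loop makes the cost visible: M + 1 forward calculations are needed for M model parameters, which is what makes this approach impractical for large models or expensive forward problems.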


(26)

Summary

Most inverse problems can be formulated as discrete linear problems, either as

$$d_i = G_{ij}\, m_j$$

… or, if the problem is linearized, as

$$\Delta d_i = G_{ij}\,\Delta m_j$$

in which case $G_{ij}$ contains the partial derivatives of the problem. The elements of $G_{ij}$ contain useful information on the resolution of the model parameters, and linear dependence may indicate trade-offs between model parameters.