

2.2 Convex Regularization

2.2.1 Ill-Posed Inverse Problems and Regularization Approaches

Ill-posed Inverse Problems

The concept of a well-posed problem was introduced by J. Hadamard (1923) in an attempt to clarify what types of boundary conditions are most natural for various types of differential equations. As a result of his investigation, a problem characterized by the equation Ax = y, where x ∈ H1, y ∈ H2 (both H1 and H2 denote Hilbert spaces) and A is a bounded linear operator, is defined to be well-posed provided the following conditions are satisfied:

1. for every element y ∈ H2 there exists a solution in the space H1;

2. the solution is unique;

3. the problem is stable on the spaces (H1, H2), which means that the solution depends continuously on the data.

Otherwise the problem is ill-posed. Later, the concept of well-posedness in the least-squares sense was introduced by Nashed [174]. According to it, Ax = y is well-posed if for each y ∈ H2 there exists a unique least-squares solution (of minimal norm) which depends continuously on the data. For years, ill-posed problems were regarded as mere mathematical anomalies.

Indeed, it was believed that physical situations only lead to well-posed problems. However, this attitude was erroneous and many ill-posed problems arise in practical situations. A detailed list of the ill-posed problems arising in mathematical physics is provided in the monograph by Lavrentiev [138].

If the image formation process is modeled in a continuous infinite-dimensional space, the distortion operator H becomes an integral operator and g = Hf + η in Eq. 2.4 becomes a Fredholm integral equation of the first kind. In that case the problem is always ill-posed. This means that the unique least-squares solution of minimal norm of g = Hf + η does not depend continuously on the data: a bounded perturbation (noise) in the data results in an unbounded perturbation in the solution, i.e., the solution obtained via the generalized inverse of the blur kernel H can be unbounded [174], [127]. The integral operator H has a countably infinite number of


Figure 2.3: Noise is amplified during deconvolution. (a) Original image. (b) Blurred image with salt-and-pepper (impulsive) noise. (c) Deconvolved image using the Richardson-Lucy filter. (d)(e)(f) Zoomed-in views.

singular values. Since the finite-dimensional discrete problem of image restoration results from the discretization of an ill-posed continuous problem, the matrix H has a cluster of small singular values. Clearly, the finer the discretization (the larger the size of the matrix H), the more closely the limit of the singular values is approximated. Therefore, although the finite-dimensional inverse problem is well-posed in the least-squares sense, the ill-posedness of the continuous problem translates into an ill-conditioned matrix H. For a detailed proof we refer to [138].

To quantify the conditioning of a matrix, the condition number N(H) can be used, defined according to the inequality [271]

\[
\frac{\|e\|}{\|f\|} \;\le\; \|H\|\,\|H^{\mathrm{inv}}\|\,\frac{\|\eta\|}{\|Hf\|} \;=\; N(H)\,\frac{\|\eta\|}{\|Hf\|} \tag{2.7}
\]

where H^inv denotes the generalized inverse of H, f is the solution for the ideal noiseless image, and e denotes the error in the solution when the noisy input image g is available. If the value of N(H) is small, a small relative change in g cannot produce a very large relative change in f. If N(H) has a large value, a small perturbation in the image may result in a large (although bounded) perturbation in the solution, and the system is said to be ill-conditioned. By using the L2 norm for vectors and matrices, N(H) takes the simplified form

\[
N(H) = \|H\|_2 \cdot \|H^{\mathrm{inv}}\|_2 = \frac{\mu_1}{\mu_r} \tag{2.8}
\]

where µ1, ..., µn are the singular values of H, r is the rank of H, and it was assumed that µ1 ≥ µ2 ≥ ... ≥ µr ≥ µr+1 = ... = µn = 0. Since the largest singular value of H is different from

Figure 2.4: Noise is amplified for different blur deconvolutions using the Richardson-Lucy filter. (a)(e) Salt-and-pepper noise. (b)(f) Motion blur deconvolution. (c)(g) Gaussian blur deconvolution. (d)(h) Pill-box blur deconvolution.

zero due to the assumption of lossless imaging, N(H) is an increasing function of the image dimensions.
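
As a simple numerical illustration of this behaviour, the following sketch computes N(H) for a discretized 1-D Gaussian blur operator; the kernel, its width and the grid sizes are arbitrary illustrative choices, and the midpoint rule is used as the quadrature.

```python
import numpy as np

def blur_matrix(n, sigma=0.03):
    """Midpoint-rule discretization of the Gaussian blur operator
    (Hf)(x) = integral_0^1 k(x - y) f(y) dy on a grid of n points."""
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / sigma) ** 2)
    return K * h / (sigma * np.sqrt(2.0 * np.pi))

for n in (16, 32, 64, 128):
    H = blur_matrix(n)
    # H is numerically full rank here, so mu_r = mu_n.
    mu = np.linalg.svd(H, compute_uv=False)          # mu_1 >= mu_2 >= ... >= mu_n > 0
    print(f"n = {n:4d}   N(H) = mu_1 / mu_n = {mu[0] / mu[-1]:.3e}")
```

In such an experiment the smallest singular values rapidly approach zero as the grid is refined, and the condition number grows by many orders of magnitude, which is the discrete footprint of the ill-posedness of the continuous operator.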

The problem of noise amplification can be further explained by using a spectral approach. That is, the minimum-norm least-squares solution of g = Hf + η can be written as

\[
\hat{f} \;=\; \sum_{i=1}^{r} \frac{(u_i, Hf)}{w_i}\, v_i \;+\; \sum_{i=1}^{r} \frac{(u_i, \eta)}{w_i}\, v_i \tag{2.9}
\]

where u_i and v_i are respectively the eigenvectors of HH^T and H^T H, w_i are the singular values of H, and (·, ·) denotes the inner product of two vectors. Clearly, since H is an ill-conditioned matrix, some of its singular values will be very close to zero, so that some of the weights w_i^{-1} are very large numbers.

If the ith inner product (u_i, η) is not zero (as is true when the noise is broadband), the noise in the second term is amplified. Similar observations can be made by using the spectral decomposition of an operator in infinite-dimensional spaces. If the matrix H is block circulant, the singular values w_i are equal to |H(x, y)| in Eq. 2.6, where | · | denotes the complex magnitude. Different deconvolution methods amplify noise differently. For example, the inverse filter and the Wiener filter are very sensitive to noise. The Richardson-Lucy method is relatively robust to noise, but the noise can still be amplified. In Fig. 2.3, impulsive salt-and-pepper noise is distributed randomly over individual pixels and is strongly amplified during the deconvolution. Fig. 2.4 shows the deconvolution results for different blur kernels. Therefore, denoising is also very important in image restoration.
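
The effect of the small singular values in Eq. (2.9) can be illustrated with a short numerical sketch; the 1-D Gaussian blur matrix, the smooth test signal and the noise level used below are arbitrary choices made only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1-D Gaussian blur matrix (row-normalized), strongly ill-conditioned.
n, sigma = 128, 3.0
x = np.arange(n)
H = np.exp(-0.5 * ((x[:, None] - x[None, :]) / sigma) ** 2)
H /= H.sum(axis=1, keepdims=True)

f = np.exp(-0.5 * ((x - n / 2) / 10.0) ** 2)         # smooth "true" signal
eta = 1e-3 * rng.standard_normal(n)                  # broadband noise, tiny relative to the signal
g = H @ f + eta

# SVD H = U diag(w) V^T and the minimum-norm least-squares solution of Eq. (2.9).
U, w, Vt = np.linalg.svd(H)
coeff_signal = (U.T @ (H @ f)) / w                   # (u_i, Hf) / w_i  -- stays bounded
coeff_noise = (U.T @ eta) / w                        # (u_i, eta) / w_i -- explodes for small w_i
f_hat = Vt.T @ (coeff_signal + coeff_noise)

print("largest / smallest singular value:", w[0], w[-1])
print("relative reconstruction error:", np.linalg.norm(f_hat - f) / np.linalg.norm(f))
```

Even though the noise is three orders of magnitude smaller than the signal, the coefficients (u_i, η)/w_i belonging to the smallest singular values dominate the expansion and the unregularized reconstruction becomes useless, which is exactly the amplification visible in Figs. 2.3 and 2.4.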

Regularization Approaches

“Regularization of ill-posed problems” is a phrase used for various approaches to circumvent the lack of continuous dependence. Roughly speaking, a regularization method entails the analysis of an ill-posed problem via an associated well-posed problem whose solution yields


meaningful answers and approximations to the given ill-posed problem. According to A. N. Tikhonov [241], the regularization method consists of finding regularizing operators that operate on the data, and determining the regularization parameters from supplementary information pertaining to the problem. The regularizing operator depends continuously on the data and yields the true solution when the regularization parameters go to zero, or equivalently when the noise goes to zero. On the other hand, in the 1970s, Vapnik et al. [248] generalized the theory of the regularization method for solving so-called stochastic ill-posed problems. They define stochastic ill-posed problems as problems of solving operator equations in the case when approximations of the function on the right-hand side converge in probability to an unknown function and/or when the approximations to the operator converge in probability to an unknown operator. In particular, regularization methods have been extended to learning problems: estimating densities, conditional densities, and kernel-based classifiers.

Numerous methods have been proposed for treating and regularizing various types of ill-posed problems. Across different research streams, the various approaches to regularization essentially involve one or more of the following intuitive ideas:

1. change of the concept of a solution [126], [219];

2. additional information restricting the solution to a compact set [105];

3. projection to change the space and/or the topologies [126];

4. shift of the spectrum to modify the operator [251], [290], [288], [186], [221];

5. well-posed stochastic extension, i.e., convergence with respect to the Lévy-Prokhorov metric on the collection of probability measures on a given metric space: Banks-Bihari, Engl-Wakolbinger, Engl-Hofinger-Kindermann [68], [43].

The various approaches to regularization overlap in many aspects, both in their theoretical progress and in the practically feasible solutions they provide for ill-posed inverse problems. Most existing image restoration methods have a common estimation structure in spite of their apparent variety, and this common structure is expressed by regularization theory. The same statement can also be made for most early vision approaches [25], [194], kernel-based regularization approaches [222], multilayer network learning approaches [193], [70], and so on. In most of these approaches, the underlying idea of regularization is to combine prior information with the data information and to define a solution that achieves smoothness while remaining faithful to the data. In other words, a regularized solution lies between the “ultra-rough” least-squares solution and an “ultra-smooth” solution based on a priori knowledge.

From an optimization point of view, regularization casts the solution as an optimization problem in which an objective or cost (energy) function is used to make the best possible choice from a set of candidates. The objective or cost function might be a measure of the overall risk or variance. For example, two cases are presented:

1. In the case of image restoration, the solution of the optimization corresponds to the choice that has minimum cost among all choices that meet the firm requirements, i.e., given a degraded image as input, output a restored image with high fidelity to the original image.

2. In the case of blur identification, the task is to find a model, from a family of potential models, that best fits some observed data and prior information. Here the variables are the descriptive parameters of the model, and the constraints encode prior knowledge or limits on the parameters (such as the nonnegativity of images in the physical world).

The objective function can be a measure of the misfit or prediction error between the observed data and the values predicted by the model, or a statistical measure of the unlikeliness or implausibility of the parameter values. Thereby, the optimization problem is to find the model parameter values that are consistent with the prior information and give the smallest misfit or prediction error with respect to the observed data (or, in a statistical framework, the smallest implausibility).
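
For the blur identification case, a minimal sketch of this misfit-plus-constraints structure could look as follows; the known sharp signal, the Gaussian blur model and the parameter bounds are hypothetical simplifications chosen only to keep the example short.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

# Synthetic observation: a sharp signal blurred with an "unknown" width and noised.
f = (np.abs(np.arange(256) - 128) < 20).astype(float)
g = gaussian_filter1d(f, sigma=4.0) + 0.01 * rng.standard_normal(f.size)

def misfit(sigma):
    """Prediction error between the observed data and the candidate blur model."""
    return np.sum((gaussian_filter1d(f, sigma=sigma) - g) ** 2)

# Prior information enters as bounds on the parameter (a positive, physically plausible width).
result = minimize_scalar(misfit, bounds=(0.1, 20.0), method="bounded")
print("estimated blur width:", result.x)
```

The objective is the data misfit, while the prior knowledge is expressed here simply as bounds on the single model parameter.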

From this analysis, restoring the image f can be seen as a minimization problem [16]. The general minimization model, which incorporates the strengths of the various types of diffusion, is

\[
J(f) = \frac{1}{2}\int_{\Omega} |g - Hf|^2 \, dx\, dy \;+\; \lambda\, S(f) \tag{2.10}
\]

where S(f) = ∫_Ω |∇f|^p dx dy for 1 ≤ p ≤ 2, λ ≥ 0, and Ω is an open bounded subset of R^n (we consider n = 2 or 3 dimensions); Ω denotes the support of the image. The first term in Eq. (2.10) measures the fidelity to the data. The second term is a penalty smoothing term. The positive regularization parameter λ controls the trade-off between fidelity to the observation and smoothness of the restored image. When the exponent of the gradient magnitude is p = 2, Eq. (2.10) becomes the L2-norm Tikhonov solution [165], [241]. L2-norm regularization has very strong isotropic (Laplacian) smoothing properties but strongly penalizes the gradients corresponding to discontinuities and edges. In order to handle discontinuities, the issue of non-directional versus directional operators was first debated by Marr and Hildreth [157], [158]. Later, some of the pioneering work in this direction was done by Rudin, Osher, and Fatemi [213], [212], who proposed to use the L1 norm (p = 1) of the gradient of f in Eq. (2.10), which is called total variation (TV) regularization. The TV method with the L1 norm encourages smoothing in the direction tangential to the edges and penalizes only weakly in the direction orthogonal to the edges, in the space of functions of bounded total variation [17], [44], [258], [260].
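
For p = 2 the minimizer of Eq. (2.10) has a closed form in the Fourier domain when the blur acts as a periodic convolution. The sketch below implements this closed form; the box kernel, test image and value of λ are arbitrary illustrative choices, and the energy stated in the docstring differs from Eq. (2.10) only by a constant rescaling of λ.

```python
import numpy as np

def tikhonov_deconvolve(g, h, lam):
    """Closed-form minimizer of 0.5*||g - h*f||^2 + 0.5*lam*||grad f||^2
    (Eq. 2.10 with p = 2, up to a rescaling of lambda), assuming periodic convolution."""
    H = np.fft.fft2(h, s=g.shape)                    # transfer function of the blur kernel
    # Power spectrum of the forward-difference gradient operator.
    ky, kx = np.meshgrid(np.fft.fftfreq(g.shape[0]),
                         np.fft.fftfreq(g.shape[1]), indexing="ij")
    D2 = (2.0 * np.sin(np.pi * kx)) ** 2 + (2.0 * np.sin(np.pi * ky)) ** 2
    F = np.conj(H) * np.fft.fft2(g) / (np.abs(H) ** 2 + lam * D2)
    return np.real(np.fft.ifft2(F))

# Usage on synthetic data: box blur plus mild white noise.
rng = np.random.default_rng(2)
f_true = np.zeros((64, 64)); f_true[20:44, 20:44] = 1.0
h = np.ones((5, 5)) / 25.0
g = np.real(np.fft.ifft2(np.fft.fft2(h, s=f_true.shape) * np.fft.fft2(f_true)))
g += 0.01 * rng.standard_normal(g.shape)
f_rec = tikhonov_deconvolve(g, h, lam=1e-2)
print("relative error:", np.linalg.norm(f_rec - f_true) / np.linalg.norm(f_true))
```

Larger values of λ suppress the noise amplification at frequencies where the transfer function is small, at the price of the isotropic over-smoothing of edges discussed above; the TV (p = 1) model has no such closed form and is solved iteratively.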

To preserve textures, edges and small-scale details, more elaborate constraints have been proposed and explored, such as forward-backward diffusion sharpening operators and more refined optimization schemes [245]. Perona and Malik [191] replaced the classical isotropic diffusion (p = 2) by general nonlinear diffusion with values 1 < p < 2, which is effective in reconstructing piecewise smooth regions and lies between isotropic, anisotropic nonlinear and TV-based smoothing [259], [266], [29], [173]. To further improve the fidelity of image restoration, different models integrating the L1 and L2 norms have been explored by Chambolle and Chan, as well as discontinuity-preserving and fidelity-enhancing schemes [39], [48], [205]. Recently, Yves Meyer (2001) [164] presented a mathematical analysis of the Rudin-Osher-Fatemi model (1992) [213] in the space of functions of bounded variation (BV), including Fourier and wavelet series expansions of BV functions. He also introduced a new space, called the G space, to model oscillating patterns; it is widely used for decomposing images into structure, texture and homogeneous layers.

Information theory has also been extended to regularization theory. For example, maximum entropy regularization uses an entropy measure instead of the usual Lp term, e.g., S(f) = ∫_Ω f ln(f/m) dx dy, where m is some positive function reflecting a priori information about f.
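
A discrete evaluation of this entropy penalty is straightforward; the following small sketch assumes a strictly positive image f and prior model m sampled on the same grid (the grid, values and unit cell area are illustrative).

```python
import numpy as np

def entropy_penalty(f, m, cell_area=1.0):
    """Discrete approximation of S(f) = integral of f * ln(f / m) over the image domain,
    for strictly positive f and positive prior model m."""
    f = np.asarray(f, dtype=float)
    m = np.asarray(m, dtype=float)
    return cell_area * np.sum(f * np.log(f / m))

# The penalty vanishes when the image coincides with the prior model m.
m = np.full((8, 8), 2.0)
print(entropy_penalty(m, m))          # 0.0
print(entropy_penalty(1.5 * m, m))    # positive for this particular deviation
```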


Figure 2.5: Convex sets and convex functions. (a)(b) Convex sets. (c) Nonconvex set. (d) Convex function. (e) Strictly convex function.

The integration and combination of information, statistical learning and variational regularization have also been investigated by researchers and remain an interesting research direction.