5. Rank-Deficient Systems with Inequalities
5.3 Rigorous Computation of a General Solution
[Figure 5.1, four panels with axes x1 and x2: (a) isolines of the objective function; (b) Case 1b: manifold and feasible region intersect and the manifold is restricted by the constraint; (c) Case 2a: manifold and feasible region are disjunct, unique solution; (d) Case 2b: manifold and feasible region are disjunct, but there is still a manifold of solutions due to the active parallel constraint.]

Figure 5.1: Isolines of the objective function of the bivariate convex optimization problem, which is stated in Sect. 5.5.1. The one-dimensional manifold (i.e., a straight line) is plotted as a dashed black line. A particular solution of the unconstrained case was computed via the Moore-Penrose inverse and is shown as an orange cross. The infeasible region is shaded and the inequalities are plotted as red lines. In (c) the unique ICLS solution is depicted as a green cross. Figure taken from Roese-Koerner and Schuh (2014).
5.3.1.3 General Remarks and Outline of the Framework
It is important to note that the direction of the manifold cannot be changed by introducing inequality constraints. More specifically, a translation (case 2b), a general restriction (case 1b) or a dimension reduction (cases 1c and 2a) of the manifold are possible, but never a rotation. This leaves us in the comfortable situation that the homogeneous solution of an ICLS problem can be determined by computing the homogeneous solution of the corresponding unconstrained WLS problem and reformulating the constraints in relation to this manifold. Therefore, our framework consists of the following major parts, which will be explained in detail in the next sections:
1. To compute a general solution of an ICLS problem (3.8), we first compute a general solution of the unconstrained WLS problem and perform a change of variables to reformulate the constraints in terms of the free variables of the homogeneous solution.
2. Next, we determine whether the manifold of solutions and the feasible region intersect.
3. If they intersect, we determine the shortest solution vector in the nullspace of the design matrix with respect to the inequality constraints and reformulate the homogeneous solution and the inequalities accordingly.
4. If there is no intersection, we use the modified active-set method described in Sect. 5.2.2 to compute a particular solution and determine the uniqueness of the solution by checking for active parallel constraints.
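As an illustration only, the following Python sketch walks through these steps for a small, dense problem with unit weights. All function and variable names are placeholders chosen for this sketch, a general-purpose SLSQP solver stands in for the dedicated QP methods discussed in the text, and the disjoint case is merely indicated, since the modified active-set method of Sect. 5.2.2 is not restated here.

```python
import numpy as np
from scipy.linalg import lstsq, null_space
from scipy.optimize import linprog, minimize

def icls_general_solution(A, y, B, b):
    """Sketch: general solution of min ||y - A x||^2 s.t. B^T x <= b
    for a rank-deficient design matrix A (unit weights assumed)."""
    # 1) General solution of the unconstrained LS problem (cf. (5.7)):
    #    particular solution x_P plus a basis X_hom of the nullspace of A.
    x_P = lstsq(A, y)[0]                 # minimum-norm particular solution
    X_hom = null_space(A)                # columns span the nullspace of A
    d = X_hom.shape[1]

    # 2) Change of variables: constraints in the free parameters lambda, (5.9)-(5.11).
    B_lam = B.T @ X_hom
    b_lam = b - B.T @ x_P

    # 3) Feasibility problem: does the manifold intersect the feasible region?
    chk = linprog(np.zeros(d), A_ub=B_lam, b_ub=b_lam, bounds=[(None, None)] * d)
    if not chk.success:
        # Case 2: manifold and feasible region are disjoint; a QP solver that
        # handles the singular system (e.g. the modified active-set method) is needed.
        raise NotImplementedError("disjoint case not covered by this sketch")

    # 4) Case 1: nullspace optimization, here the (squared) L2 length of the parameter vector.
    cons = [{"type": "ineq", "fun": lambda lam: b_lam - B_lam @ lam}]
    res = minimize(lambda lam: np.sum((x_P + X_hom @ lam) ** 2),
                   np.zeros(d), method="SLSQP", constraints=cons)
    x_part = x_P + X_hom @ res.x         # particular ICLS solution, cf. (5.13)

    # 5) Re-reference the constraints to the new particular solution (cf. (5.8)).
    b_lam = b - B.T @ x_part
    return x_part, X_hom, B_lam, b_lam   # general solution, cf. (5.14a), (5.14b)
```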
It should be noted that any QP solver capable of handling singular systems can be used to determine this particular solution. In earlier publications (Roese-Koerner and Schuh, 2014, 2015), Dantzig's simplex algorithm for QPs was used for this purpose. However, we found that the modified active-set method consumes less memory than Dantzig's algorithm (at least in its basic formulation) and therefore replaced the latter.
5.3.2 Transformation of Parameters
In order to solve the ICLS problem (3.8), as a first step, we compute a general solution of the unconstrained WLS problem (2.32) as described in Sect. 2.1.3.2
$\widetilde{\mathbf{x}}_{\mathrm{WLS}}(\boldsymbol{\lambda}) = \mathbf{x}^{P}_{\mathrm{WLS}} + \mathbf{x}_{\mathrm{hom}}(\boldsymbol{\lambda}) = \mathbf{x}^{P}_{\mathrm{WLS}} + \mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda}.$  (5.7)

Afterwards, we perform a change of variables and insert (5.7) and (2.13) into the inequality constraints

$\mathbf{B}^{T}\mathbf{x} \leq \mathbf{b}$

to reformulate them in terms of the free variables $\boldsymbol{\lambda}$ at the point of the particular solution,

$\mathbf{B}^{T}\left(\mathbf{x}^{P}_{\mathrm{WLS}} + \mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda}\right) \leq \mathbf{b},$  (5.8a)

leading to

$\mathbf{B}^{T}\mathbf{x}^{P}_{\mathrm{WLS}} + \mathbf{B}^{T}\mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda} \leq \mathbf{b}.$  (5.8b)

With the substitutions

$\mathbf{B}^{T}_{\lambda} := \mathbf{B}^{T}\mathbf{X}_{\mathrm{hom}},$  (5.9)
$\mathbf{b}_{\lambda} := \mathbf{b} - \mathbf{B}^{T}\mathbf{x}^{P}_{\mathrm{WLS}},$  (5.10)

(5.8b) reads

$\mathbf{B}^{T}_{\lambda}\,\boldsymbol{\lambda} \leq \mathbf{b}_{\lambda},$  (5.11)
being the desired formulation of the inequality constraints. If the extended matrix $[\mathbf{B}^{T}_{\lambda}\,|\,\mathbf{b}_{\lambda}]$ has rows that contain only zeros, these rows can be deleted from the equation system, as they belong to inactive inequality constraints which are parallel to the solution manifold but do not shift the optimal solution (cf. case 1a in Sect. 5.3.1.1).
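In code, the substitutions (5.9) and (5.10) and the removal of all-zero rows could look as follows; this is a minimal numpy sketch, and the tolerance `tol` as well as all names are choices made for this illustration.

```python
import numpy as np

def reformulate_constraints(B, b, x_P, X_hom, tol=1e-12):
    """Express the constraints B^T x <= b in the free parameters lambda, (5.9)-(5.11)."""
    B_lam = B.T @ X_hom                  # (5.9)
    b_lam = b - B.T @ x_P                # (5.10)
    # Drop rows of the extended matrix [B_lam | b_lam] that are (numerically) zero:
    # they belong to inactive constraints parallel to the solution manifold (case 1a).
    keep = np.any(np.abs(np.column_stack([B_lam, b_lam])) > tol, axis=1)
    return B_lam[keep], b_lam[keep]
```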
For some applications, it is possible to directly state a basis $\mathbf{X}_{\mathrm{hom}}$ for the nullspace of $\mathbf{A}$. This offers the advantage that the free parameters $\boldsymbol{\lambda}$ are interpretable. For example, in the adjustment of a geodetic direction and distance network, the free parameters represent translation, rotation and scaling of the network, and an explicit basis for the nullspace is known (cf. Meissl, 1969).
For a horizontal network, where there is a known basis for the nullspace of the design matrix, Xu (1997) worked out constraints (such as a positive scaling parameter) for the free parameters to obtain a geodetically meaningful solution. These constraints can easily be included in our framework by using the known basis of the nullspace as homogeneous solution $\mathbf{X}_{\mathrm{hom}}$ and expanding the constraints $\mathbf{B}^{T}_{\lambda}$ and $\mathbf{b}_{\lambda}$ by the corresponding conditions on the free parameters.
Next, we examine whether the constraints (5.11) form a feasible set or whether they are inconsistent. This can be done by formulating and solving a feasibility problem (cf. Boyd and Vandenberghe, 2004, p. 579–580). If the constraints form a feasible set, there is at least one set of free parameters $\boldsymbol{\lambda}_i$ that fulfills all constraints. This is equivalent to the statement that there is an intersection between the manifold of solutions and the feasible region of the original ICLS problem, which is the case described in Sect. 5.3.3. If the constraints are inconsistent, the manifold and the feasible region are disjoint and we proceed as described in Sect. 5.3.4.
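Numerically, such a feasibility problem can be posed, for instance, as a linear program with a zero objective, so that the solver only has to report whether any feasible point exists. The sketch below uses scipy for this purpose; any LP or dedicated feasibility solver would do.

```python
import numpy as np
from scipy.optimize import linprog

def constraints_feasible(B_lam, b_lam):
    """True if some lambda with B_lam @ lambda <= b_lam exists (constraints (5.11))."""
    d = B_lam.shape[1]
    res = linprog(c=np.zeros(d),               # zero objective: pure feasibility check
                  A_ub=B_lam, b_ub=b_lam,
                  bounds=[(None, None)] * d)   # lambda is unrestricted in sign
    return res.status == 0                     # 0: feasible point found, 2: infeasible
```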
5.3.3 Case 1: Intersection of Manifold and Feasible Region
If there is an intersection of manifold and feasible region, we aim at determining a unique particular solution $\widetilde{\mathbf{x}}^{P}$ that fulfills certain predefined optimality criteria. Therefore, a second optimization problem is introduced (e.g., minimizing the length of the solution vector with respect to some $L_p$ norm). As this minimization takes place in the nullspace of the design matrix $\mathbf{A}$, the value of the objective function of the original problem (3.8) does not change:
Nullspace optimization problem
objective function: $\Phi_{\mathrm{NS}}\bigl(\mathbf{x}^{P}_{\mathrm{ICLS}}(\boldsymbol{\lambda})\bigr) \;\ldots\; \min$
constraints: $\mathbf{B}^{T}_{\lambda}\,\boldsymbol{\lambda} \leq \mathbf{b}_{\lambda}$
optimization variable: $\boldsymbol{\lambda} \in \mathbb{R}^{d}$.  (5.12)
The minimization yields optimal free parameters $\widetilde{\boldsymbol{\lambda}}$, whose insertion in (5.7) results in

$\widetilde{\mathbf{x}}^{P}_{\mathrm{ICLS}} = \mathbf{x}^{P}_{\mathrm{WLS}} + \mathbf{X}_{\mathrm{hom}}\,\widetilde{\boldsymbol{\lambda}},$  (5.13)
which is a unique particular solution that fulfills all constraints. As the reformulated constraints depend on the chosen particular solution, we have to adapt them to the new particular solution using (5.8). Now, we can combine our results into a rigorous general solution of the ICLS problem (3.8):
$\widetilde{\mathbf{x}}_{\mathrm{ICLS}}(\boldsymbol{\lambda}) = \widetilde{\mathbf{x}}^{P}_{\mathrm{ICLS}} + \mathbf{x}_{\mathrm{hom}}(\boldsymbol{\lambda}),$  (5.14a)
$\mathbf{B}^{T}_{\lambda}\,\boldsymbol{\lambda} \leq \mathbf{b}_{\lambda}.$  (5.14b)
In the following, the influence of different norms and functional relationships in the objective function $\Phi_{\mathrm{NS}}$ of the nullspace optimization problem (5.12) is discussed.
5.3.3.1 Nullspace Optimization
As stated above, the whole minimization takes place in the nullspace of the design matrix and the value of the original objective function (3.8) will not change. Nonetheless, choosing a suitable second objective function for a particular problem can be helpful to achieve properties such as sparsity of the parameter vector. If this second objective function is chosen such that the nullspace optimization problem becomes a QP, it can, for example, be solved using the active-set method described in Sect. 3.5.1. If the problem is an LP or a general non-linear problem, specialized solvers (e.g., CVX, see Grant and Boyd, 2014) have to be used.
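For instance, with the minimum-length objective discussed below, the nullspace problem (5.12) can be passed almost verbatim to a modeling tool such as CVXPY (the Python counterpart of the CVX package mentioned above). The sketch assumes that x_P, X_hom, B_lam and b_lam have already been computed as in Sect. 5.3.2.

```python
import cvxpy as cp

def solve_nullspace_problem(x_P, X_hom, B_lam, b_lam):
    """Nullspace optimization (5.12) with the squared L2 length of the parameter
    vector as second objective; returns the optimal free parameters lambda."""
    lam = cp.Variable(X_hom.shape[1])
    objective = cp.Minimize(cp.sum_squares(x_P + X_hom @ lam))   # Phi_NS
    constraints = [B_lam @ lam <= b_lam]                         # (5.11)
    cp.Problem(objective, constraints).solve()
    return lam.value
```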
Two different parts of the objective function can be distinguished and will be examined in some detail: the functional relationship and the norm with respect to which the minimization (or maximization) shall be performed.
Functional Relationship. The functional relationship strongly depends on the application. However, there are two main concepts which are applicable to a large variety of problems.
First, the length of the parameter vector $\mathbf{x}$ can be minimized. This can, for example, be beneficial if not absolute coordinates but coordinate differences are estimated, which should remain close to the initial coordinates. In a Bayesian interpretation, this could be seen as the introduction of isotropic prior information on the parameters.
More sophisticated approaches include a weighted minimization of the length of the parameter vector, for example if prior knowledge about the magnitude of the parameters is given. Prominent examples are Kaula's rule of thumb in gravity field estimation or the demand for a decay of the amplitudes of higher frequencies in signal processing to achieve a square-integrable function. A mathematical example for the minimization of the length of the parameter vector is provided in Sect. 5.5.1.2.
The other main concept is to maximize the distance of the solution to the constraints
$\|\mathbf{B}^{T}\mathbf{x}(\boldsymbol{\lambda}) - \mathbf{b}\| \;\ldots\; \max,$  (5.15a)

thus

$\|\mathbf{B}^{T}\mathbf{x}^{P} + \mathbf{B}^{T}\mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda} - \mathbf{b}\| \;\ldots\; \max.$  (5.15b)
This could be beneficial if the constraints constitute a kind of outermost threshold. In Sect. 7.3.2 an engineering application including welding tolerances is described, in which this type of functional relationship is applied. One of the main advantages of the use of inequality constraints is that they do not influence the result if they are not active. Ordinarily, it is therefore not possible to provide a buffer to the boundary of the feasible set without losing estimation quality (shown by an increased value of the original objective function). However, due to the optimization in the nullspace of $\mathbf{A}$, we are in the unique position to apply such a buffer without deteriorating the estimate.
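One possible convex realization of this buffer idea, though not necessarily the exact formulation used in Sect. 7.3.2, is to maximize the smallest slack of the reformulated constraints over the free parameters, which is a linear program. The sketch assumes B_lam and b_lam from (5.9) and (5.10).

```python
import numpy as np
from scipy.optimize import linprog

def max_min_slack(B_lam, b_lam):
    """Maximize t such that B_lam @ lam + t <= b_lam, i.e. keep the solution
    at least a uniform buffer t away from every reformulated constraint."""
    m, d = B_lam.shape
    c = np.zeros(d + 1)
    c[-1] = -1.0                                   # decision vector z = [lam, t]; maximize t
    A_ub = np.hstack([B_lam, np.ones((m, 1))])     # rows encode B_lam @ lam + t <= b_lam
    res = linprog(c, A_ub=A_ub, b_ub=b_lam, bounds=[(None, None)] * (d + 1))
    if res.status != 0:
        # Infeasible constraints, or an unbounded buffer (e.g. a single non-binding
        # constraint); bound t or lambda explicitly in that case.
        raise ValueError("buffer LP not solved")
    return res.x[:d], res.x[-1]                    # optimal lambda and achieved buffer t
```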
Norms. In the following, we point out some general aspects of different norms and their influence on the most classical functional relationship: the length of the solution vector. $L_p$ norms used in adjustment theory include the $L_1$ norm, the $L_2$ norm and the $L_\infty$ norm (cf. Sect. 2.3 and Jäger et al., 2005; Boyd and Vandenberghe, 2004, p. 125–128 and p. 635, respectively).
Most often, the length of a vector is minimized with respect to the $L_2$ norm (also known as the Euclidean norm), which results in the nullspace objective function

$\Phi^{L_2}_{\mathrm{NS}}\bigl(\mathbf{x}^{P}_{\mathrm{ICLS}}(\boldsymbol{\lambda})\bigr) = \|\mathbf{x}^{P}_{\mathrm{ICLS}}(\boldsymbol{\lambda})\|^{2}_{2}$  (5.16a)
$= \mathbf{x}^{P}_{\mathrm{ICLS}}(\boldsymbol{\lambda})^{T}\,\mathbf{x}^{P}_{\mathrm{ICLS}}(\boldsymbol{\lambda}).$  (5.16b)
[Figure 5.2: unit spheres of the $L_1$, $L_2$ and $L_\infty$ norms in the plane.]

Figure 5.2: Two-dimensional unit spheres of the $L_1$ (orange), $L_2$ (blue) and $L_\infty$ (green) norm. The $L_1$ diamond evolves from the summation of the $x$ and $y$ values, while for the $L_\infty$ square, only the value of the largest quantity is decisive. Figure taken from Roese-Koerner and Schuh (2015).
Inserting (5.7) yields

$\Phi^{L_2}_{\mathrm{NS}}(\boldsymbol{\lambda}) = \bigl(\mathbf{x}^{P}_{\mathrm{ICLS}} + \mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda}\bigr)^{T}\bigl(\mathbf{x}^{P}_{\mathrm{ICLS}} + \mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda}\bigr)$  (5.16c)
$= \boldsymbol{\lambda}^{T}\mathbf{X}^{T}_{\mathrm{hom}}\mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda} + 2\,\bigl(\mathbf{x}^{P}_{\mathrm{ICLS}}\bigr)^{T}\mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda} + \bigl(\mathbf{x}^{P}_{\mathrm{ICLS}}\bigr)^{T}\mathbf{x}^{P}_{\mathrm{ICLS}}$  (5.16d)

and the following quadratic program:

$L_2$ norm nullspace optimization problem
objective function: $\Phi^{L_2}_{\mathrm{NS}}(\boldsymbol{\lambda}) = \boldsymbol{\lambda}^{T}\mathbf{X}^{T}_{\mathrm{hom}}\mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda} + 2\,\bigl(\mathbf{x}^{P}_{\mathrm{ICLS}}\bigr)^{T}\mathbf{X}_{\mathrm{hom}}\,\boldsymbol{\lambda} \;\ldots\; \min$
constraints: $\mathbf{B}^{T}_{\lambda}\,\boldsymbol{\lambda} \leq \mathbf{b}_{\lambda}$
optimization variable: $\boldsymbol{\lambda} \in \mathbb{R}^{d}$.  (5.17)
The last term of objective function (5.16) is constant and was neglected, as it does not influence the minimization result. In the case of solely inactive constraints, the resulting solution will be equivalent to the one obtained via the Moore-Penrose inverse in the unconstrained case. If the constraints prevent the optimal unconstrained solution from being reached, our particular solution will be the one with the shortest length among all solutions that minimize the objective function and fulfill the constraints. The procedure can therefore be seen as a kind of pseudoinverse that takes inequality constraints into account.
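This pseudoinverse interpretation can be checked on a toy example: if no constraint becomes active, minimizing (5.17) reproduces the minimum-norm solution obtained with the Moore-Penrose inverse. The numbers below are made up for illustration, the single constraint is deliberately chosen loose enough to stay inactive, and a general-purpose SLSQP solver is used in place of the active-set method of Sect. 3.5.1.

```python
import numpy as np
from scipy.linalg import lstsq, null_space
from scipy.optimize import minimize

# Rank-deficient toy problem: the second column of A duplicates the first.
A = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
B = np.array([[1.0], [1.0]])              # constraint x1 + x2 <= 10 (inactive here)
b = np.array([10.0])

x_P = lstsq(A, y)[0]                      # minimum-norm particular solution
X_hom = null_space(A)
B_lam, b_lam = B.T @ X_hom, b - B.T @ x_P

# L2 norm nullspace QP (5.17); the constant term of (5.16d) is omitted.
obj = lambda lam: lam @ X_hom.T @ X_hom @ lam + 2 * x_P @ X_hom @ lam
cons = [{"type": "ineq", "fun": lambda lam: b_lam - B_lam @ lam}]
lam_opt = minimize(obj, np.zeros(X_hom.shape[1]), method="SLSQP", constraints=cons).x

x_qp = x_P + X_hom @ lam_opt
x_pinv = np.linalg.pinv(A) @ y
print(np.allclose(x_qp, x_pinv, atol=1e-6))   # True: both yield [0.5, 0.5]
```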
Using the $L_2$ norm is a quite natural choice which can easily be visualized geometrically. However, as the influence of an element on the norm decreases as its absolute value becomes smaller, a minimization with respect to the $L_2$ norm will seldom result in a sparse vector. This can be verified by examining the blue $L_2$ norm unit sphere depicted in Fig. 5.2.
The naive choice to achieve maximal sparsity of a vector would be a minimization with respect to the $L_0$ norm (i.e., the number of nonzero elements). However, a minimization with respect to the $L_0$ norm is a combinatorial problem and computationally very demanding. Therefore, e.g., Candes