

2.3.5 Conjugate Gradient Method

Task: Solve $\mbox{${\bf A}$} \cdot \vec{x}=\vec{b}$. This may be interpreted as a minimization problem:

Let

\begin{displaymath}
f(\vec{x}) \equiv \frac{1}{2} \, \left\vert \mbox{${\bf A}$} \cdot \vec{x} - \vec{b}
\right\vert^{2} \, ,
\end{displaymath}

then the $N$-vector $\vec{x}$ which minimizes $f(\vec{x})$ (with the minimum value $f=0$) is the desired solution.
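The gradient of this quadratic function, which drives all the search directions used below, is

\begin{displaymath}
\vec{\nabla} f(\vec{x}) = \mbox{${\bf A}$}^{T} \cdot \left[ \mbox{${\bf A}$} \cdot \vec{x} - \vec{b} \right] \, ;
\end{displaymath}

for regular $\mbox{${\bf A}$}$ it vanishes exactly at the solution of the linear system.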



Minimization of a quadratic function of $N$ variables:

Figure: Conjugate gradients: $\vec{g}_{0}=-\vec{\nabla} f(P_{0})$ ... direction of steepest descent at $P_{0}$, etc.; $\vec{h}_{1}$ ... direction of the gradient conjugate to $\vec{g}_{0}$.
\begin{figure}\includegraphics[height=180pt]{figures/f2cg.ps}\end{figure}


Warm-up: Cauchy's Steepest Descent Method

Assume $f$ to depend on two variables $\vec{x}=(x_{1},x_{2})$ only. The lines of equal elevation of a quadratic function are ellipses which may be very elongated (see the figure). Starting at $P_{0}$, one moves along the direction of steepest descent, $\vec{g}_{0}=-\vec{\nabla} f(P_{0})$, until the line minimum $P_{1}$ is reached; there the new gradient $\vec{g}_{1}$ is perpendicular to $\vec{g}_{0}$, and the procedure is repeated. For strongly elongated ellipses this produces many short zigzag steps and thus slow convergence.
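In code, this line-minimization iteration might look as follows (a minimal NumPy sketch; the routine name and the fixed iteration count are illustrative choices, not part of the original notes):

\begin{verbatim}
import numpy as np

def steepest_descent(a, b, x, n_steps=100):
    # Minimize f(x) = 0.5*|a @ x - b|**2 by successive exact line
    # minimizations along the local direction of steepest descent.
    for _ in range(n_steps):
        g = -a.T @ (a @ x - b)        # steepest-descent direction at x
        ag = a @ g
        if ag @ ag == 0.0:            # gradient vanished: x is the minimum
            break
        lam = (g @ g) / (ag @ ag)     # exact line minimum along g
        x = x + lam * g               # next "low point"
    return x
\end{verbatim}

For strongly elongated ellipses many such zigzag steps are needed, which is what the conjugate direction construction below avoids.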



Now for the serious work: Conjugate Gradients

From point $P_{1}$ take the direction $\vec{h}_{1}$ instead of $\vec{g}_{1}$.

How to find $\vec{h}_{1}$?

Require that along $\vec{h}_{1}$ the change of the gradient of $f$ have no component parallel to $\vec{g}_{0}$. (In contrast, along $\vec{g}_{1}$ the gradient of $f$ merely starts out with no $\vec{g}_{0}$-component.) As a consequence, a gradient component along $\vec{g}_{0}$ does not build up again right away (and on quadratic surfaces never does).

$\Longrightarrow$ This requirement leads to the prescription given in Figure 2.2.
Figure 2.2: The CG method
\fbox{\begin{minipage}{480pt}
{\bf Conjugate gradient technique:}
\begin{enumerate}
\item At the starting point $\vec{x}_{0}$ compute the direction of steepest descent
\begin{displaymath}
\mbox{$\vec{g}_{0}$} = - \mbox{${\bf A}$}^{T} \cdot \left[ \mbox{${\bf A}$} \cdot \vec{x}_{0} - \vec{b} \right]
\end{displaymath}
and follow it to the line minimum
\begin{displaymath}
\vec{x}_{1} = \vec{x}_{0} + \lambda_{1} \, \mbox{$\vec{g}_{0}$} \, , \;\;\;
\lambda_{1} = \frac{\left\vert \mbox{$\vec{g}_{0}$} \right\vert^{2}}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$} \right\vert^{2}} \, .
\end{displaymath}
\item At $\vec{x}_{1}$ compute the new negative gradient $\vec{g}_{1}$ and the conjugate direction
\begin{displaymath}
\mbox{$\vec{h}_{1}$} = \mbox{$\vec{g}_{1}$} -
\frac{(\mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{1}$}) \cdot (\mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$})}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$} \right\vert^{2}} \, \mbox{$\vec{g}_{0}$} \, .
\end{displaymath}
\item Follow $\vec{h}_{1}$ to the next line minimum
\begin{displaymath}
\vec{x}_{2} = \vec{x}_{1} + \lambda_{2} \, \mbox{$\vec{h}_{1}$} \, , \;\;\;
\lambda_{2} = \frac{\left\vert \mbox{$\vec{g}_{1}$} \cdot \mbox{$\vec{h}_{1}$} \right\vert}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{h}_{1}$} \right\vert^{2}} \, .
\end{displaymath}
\end{enumerate}
\end{minipage}}
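Spelling out this requirement: a step $\lambda \, \vec{h}_{1}$ away from $\vec{x}_{1}$ changes the gradient of $f$ by $\lambda \, \mbox{${\bf A}$}^{T} \mbox{${\bf A}$} \cdot \vec{h}_{1}$, and demanding that this change have no component along $\vec{g}_{0}$ means

\begin{displaymath}
\vec{g}_{0} \cdot \left( \mbox{${\bf A}$}^{T} \mbox{${\bf A}$} \cdot \vec{h}_{1} \right)
= \left( \mbox{${\bf A}$} \cdot \vec{g}_{0} \right) \cdot \left( \mbox{${\bf A}$} \cdot \vec{h}_{1} \right) = 0 \, .
\end{displaymath}

Inserting the ansatz $\vec{h}_{1} = \vec{g}_{1} + \gamma \, \vec{g}_{0}$ fixes
$\gamma = - (\mbox{${\bf A}$} \cdot \vec{g}_{1}) \cdot (\mbox{${\bf A}$} \cdot \vec{g}_{0}) /
\left\vert \mbox{${\bf A}$} \cdot \vec{g}_{0} \right\vert^{2}$, which is exactly the direction prescribed in the box.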


For dimension $N=2$ this is all there is: $\vec{x} = \vec{x}_{2}$ is the solution vector. For systems of higher dimension one continues from $\vec{x}_{2}$ along the direction

\begin{displaymath}
\mbox{$\vec{h}_{2}$} = \mbox{$\vec{g}_{2}$} -
\frac{(\mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{2}$}) \cdot (\mbox{${\bf A}$} \cdot \mbox{$\vec{h}_{1}$})}
{\left\vert \mbox{${\bf A}$} \cdot \vec{h}_{1} \right\vert^{2}} \, \mbox{$\vec{h}_{1}$}
\end{displaymath}

until the next ``low point'' is reached at

\begin{displaymath}
\vec{x}_{3} = \vec{x}_{2} + \lambda_{3} \, \mbox{$\vec{h}_{2}$} \, ,
\end{displaymath}

with

\begin{displaymath}
\lambda_{3} = \frac{\left\vert \mbox{$\vec{g}_{2}$} \cdot \mbox{$\vec{h}_{2}$} \right\vert}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{h}_{2}$} \right\vert^{2}} \, .
\end{displaymath}

Next we determine $\vec{x}_{4}$ and so forth, until we arrive at the solution $\vec{x} = \vec{x}_{N}$.
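The complete recursion fits into a few lines; here is a minimal NumPy sketch (the function name, the optional step count, and the zero-denominator guard are illustrative choices, not from the notes):

\begin{verbatim}
import numpy as np

def conjugate_gradient(a, b, x, n_steps=None):
    # Minimize f(x) = 0.5*|a @ x - b|**2 along successive conjugate
    # directions; for a quadratic problem N steps suffice in exact
    # arithmetic.
    n = len(b) if n_steps is None else n_steps
    g = -a.T @ (a @ x - b)                 # g_0: steepest descent at x_0
    h = g                                  # first direction is the gradient
    for _ in range(n):
        ah = a @ h
        denom = ah @ ah
        if denom == 0.0:                   # gradient vanished: x solves A.x = b
            break
        lam = (g @ h) / denom              # line minimum along h
        x = x + lam * h                    # next "low point"
        g = -a.T @ (a @ x - b)             # new negative gradient
        ag = a @ g
        h = g - ((ag @ ah) / denom) * h    # direction conjugate to the old h
    return x
\end{verbatim}

Called with the $2 \times 2$ data of the example below, two passes through the loop already reproduce the solution $\vec{x}_{2} \approx (1,0)$.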

Advantage of CG: No matrix inversion!

But: several multiplications $\mbox{${\bf A}$} \cdot \vec{x}$ are required; the method is therefore most efficient for sparse matrices.
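The point is that $\mbox{${\bf A}$}$ enters only through products $\mbox{${\bf A}$} \cdot \vec{v}$ and $\mbox{${\bf A}$}^{T} \cdot \vec{v}$, so a sparse storage scheme can be used directly. A brief illustration with SciPy (the tridiagonal test matrix is purely illustrative, not part of the original notes):

\begin{verbatim}
import numpy as np
from scipy.sparse import diags

# Sparse tridiagonal matrix: never stored (or inverted) as a full array.
n = 1000
a = diags([-1.0, 3.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = np.zeros(n)

g = -(a.T @ (a @ x - b))   # one gradient evaluation = two sparse matrix-vector products
\end{verbatim}

The CG routine sketched above works unchanged with such a matrix, since it never forms $\mbox{${\bf A}$}^{-1}$.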

EXAMPLE: To see the mechanics of the method, assume

\begin{displaymath}
\mbox{${\bf A}$}=\mbox{$\left( \begin{array}{cc} 3 & 1 \\ 2 & 4 \end{array} \right)$} \, , \;\;\;
\vec{b}=\mbox{$\left( \begin{array}{r} 3 \\ 2 \end{array} \right)$} \, , \;\;\;
\vec{x}_{0}=\mbox{$\left( \begin{array}{r} 1.2 \\ 0.2 \end{array} \right)$}
\end{displaymath}

The negative gradient (direction of steepest descent) at $\vec{x}_{0}$ is

\begin{displaymath}
\mbox{$\vec{g}_{0}$} = - \mbox{${\bf A}$}^{T} \cdot \left[ \mbox{${\bf A}$} \cdot \vec{x}_{0} - \vec{b} \right]
= - \mbox{$\left( \begin{array}{r} 4.8 \\ 5.6 \end{array} \right)$}
\end{displaymath}

and

\begin{displaymath}
\lambda_{1} = \frac{\left\vert \mbox{$\vec{g}_{0}$} \right\vert^{2}}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$} \right\vert^{2}} = 0.038
\end{displaymath}

such that

\begin{displaymath}
\vec{x}_{1}=\vec{x}_{0}+\lambda_{1}\vec{g}_{0}
=\mbox{$\left( \begin{array}{r} 1.017 \\ -0.014 \end{array} \right)$}
\end{displaymath}

Similarly,

\begin{displaymath}
\mbox{$\vec{g}_{1}$}=\mbox{$\left( \begin{array}{r} -0.063 \\ 0.054 \end{array} \right)$} \, , \;\;\;
\mbox{$\vec{h}_{1}$}=\mbox{$\left( \begin{array}{r} -0.0635 \\ 0.0532 \end{array} \right)$} \, , \;\;\;
\lambda_{2}=0.262 \, ,
\end{displaymath}

and thus

\begin{displaymath}
\vec{x}_{2}= \mbox{$\left( \begin{array}{r} 1.000 \\ \vspace{-9 pt}\\ 0.000 \end{array} \right)$}
\end{displaymath}

which is the desired solution vector in this 2-dimensional case. If $\mbox{${\bf A}$}$ were an $N \times N$ matrix, $N$ such steps would be necessary.
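For completeness, the two steps can be retraced numerically (a direct transcription of the formulas above into NumPy; the variable names are illustrative):

\begin{verbatim}
import numpy as np

a  = np.array([[3.0, 1.0], [2.0, 4.0]])
b  = np.array([3.0, 2.0])
x0 = np.array([1.2, 0.2])

g0   = -a.T @ (a @ x0 - b)                          # = -(4.8, 5.6)
lam1 = (g0 @ g0) / ((a @ g0) @ (a @ g0))            # = 0.038
x1   = x0 + lam1 * g0                               # = (1.017, -0.014)

g1   = -a.T @ (a @ x1 - b)
h1   = g1 - (((a @ g1) @ (a @ g0)) / ((a @ g0) @ (a @ g0))) * g0
lam2 = (g1 @ h1) / ((a @ h1) @ (a @ h1))            # = 0.262
x2   = x1 + lam2 * h1

print(np.round(x2, 3))                              # [1. 0.]
\end{verbatim}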

Franz J. Vesely Oct 2005
See also:
"Computational Physics - An Introduction," Kluwer-Plenum 2001