

2.3.5 Conjugate Gradient Method

Task: Solve $\mbox{${\bf A}$} \cdot \vec{x}=\vec{b}$. This may be interpreted as a minimization problem:

Let

\begin{displaymath}
f(\vec{x}) \equiv \frac{1}{2} \, \left\vert \mbox{${\bf A}$} \cdot \vec{x} - \vec{b}
\right\vert^{2} \, ,
\end{displaymath}

then the $N$-vector $\vec{x}$ which minimizes $f(\vec{x})$ (with the minimum value $f=0$) is the desired solution.
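The gradient of this quadratic function, which drives all the search directions used below, is

\begin{displaymath}
\vec{\nabla} f(\vec{x}) = \mbox{${\bf A}$}^{T} \cdot \left[ \mbox{${\bf A}$} \cdot \vec{x} - \vec{b} \right] \, ;
\end{displaymath}

for regular $\mbox{${\bf A}$}$ it vanishes exactly at the solution of the linear system.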



Minimization of a quadratic function of $N$ variables:

Figure: Conjugate gradients: $\vec{g}_{0}=-\vec{\nabla} f(P_{0})$ ... direction of steepest descent at $P_{0}$, etc.; $\vec{h}_{1}$ ... direction of the gradient conjugate to $\vec{g}_{0}$.
\begin{figure}\includegraphics[height=180pt]{figures/f2cg.ps}\end{figure}


Warm-up: Cauchy's Steepest Descent Method

Assume $f$ to depend on two variables $\vec{x}=(x_{1},x_{2})$ only. The lines of equal elevation of a quadratic function are ellipses which may be very elongated (see the figure). Starting at $P_{0}$, one moves along the direction of steepest descent, $\vec{g}_{0}=-\vec{\nabla} f(P_{0})$, until the line minimum $P_{1}$ is reached; there the new gradient $\vec{g}_{1}$ is perpendicular to $\vec{g}_{0}$, and the procedure is repeated. For strongly elongated ellipses this produces many short zigzag steps and thus slow convergence.
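In code, this line-minimization iteration might look as follows (a minimal NumPy sketch; the routine name and the fixed iteration count are illustrative choices, not part of the original notes):

\begin{verbatim}
import numpy as np

def steepest_descent(a, b, x, n_steps=100):
    # Minimize f(x) = 0.5*|a @ x - b|**2 by successive exact line
    # minimizations along the local direction of steepest descent.
    for _ in range(n_steps):
        g = -a.T @ (a @ x - b)        # steepest-descent direction at x
        ag = a @ g
        if ag @ ag == 0.0:            # gradient vanished: x is the minimum
            break
        lam = (g @ g) / (ag @ ag)     # exact line minimum along g
        x = x + lam * g               # next "low point"
    return x
\end{verbatim}

For strongly elongated ellipses many such zigzag steps are needed, which is what the conjugate direction construction below avoids.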



Now for the serious work: Conjugate Gradients

From point $P_{1}$ take the direction $\vec{h}_{1}$ instead of $\vec{g}_{1}$.

How to find $\vec{h}_{1}$?

Require that along $\vec{h}_{1}$ the change of the gradient of $f$ have no component parallel to $\vec{g}_{0}$. (In contrast, along $\vec{g}_{1}$ the gradient of $f$ merely starts out with no $\vec{g}_{0}$-component.) As a consequence, a gradient component along $\vec{g}_{0}$ does not build up again right away (and on quadratic surfaces never does).

$\Longrightarrow$ This requirement leads to the prescription given in Figure 2.2.
Figure 2.2: The CG method
\fbox{\begin{minipage}{480pt}
{\bf Conjugate gradient technique:}
\begin{enumerate}
\item At the starting point $\vec{x}_{0}$ compute the direction of steepest descent
\begin{displaymath}
\mbox{$\vec{g}_{0}$} = - \mbox{${\bf A}$}^{T} \cdot \left[ \mbox{${\bf A}$} \cdot \vec{x}_{0} - \vec{b} \right]
\end{displaymath}
and follow it to the line minimum
\begin{displaymath}
\vec{x}_{1} = \vec{x}_{0} + \lambda_{1} \, \mbox{$\vec{g}_{0}$} \, , \;\;\;
\lambda_{1} = \frac{\left\vert \mbox{$\vec{g}_{0}$} \right\vert^{2}}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$} \right\vert^{2}} \, .
\end{displaymath}
\item At $\vec{x}_{1}$ compute the new negative gradient $\vec{g}_{1}$ and the conjugate direction
\begin{displaymath}
\mbox{$\vec{h}_{1}$} = \mbox{$\vec{g}_{1}$} -
\frac{(\mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{1}$}) \cdot (\mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$})}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$} \right\vert^{2}} \, \mbox{$\vec{g}_{0}$} \, .
\end{displaymath}
\item Follow $\vec{h}_{1}$ to the next line minimum
\begin{displaymath}
\vec{x}_{2} = \vec{x}_{1} + \lambda_{2} \, \mbox{$\vec{h}_{1}$} \, , \;\;\;
\lambda_{2} = \frac{\left\vert \mbox{$\vec{g}_{1}$} \cdot \mbox{$\vec{h}_{1}$} \right\vert}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{h}_{1}$} \right\vert^{2}} \, .
\end{displaymath}
\end{enumerate}
\end{minipage}}
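Spelling out this requirement: a step $\lambda \, \vec{h}_{1}$ away from $\vec{x}_{1}$ changes the gradient of $f$ by $\lambda \, \mbox{${\bf A}$}^{T} \mbox{${\bf A}$} \cdot \vec{h}_{1}$, and demanding that this change have no component along $\vec{g}_{0}$ means

\begin{displaymath}
\vec{g}_{0} \cdot \left( \mbox{${\bf A}$}^{T} \mbox{${\bf A}$} \cdot \vec{h}_{1} \right)
= \left( \mbox{${\bf A}$} \cdot \vec{g}_{0} \right) \cdot \left( \mbox{${\bf A}$} \cdot \vec{h}_{1} \right) = 0 \, .
\end{displaymath}

Inserting the ansatz $\vec{h}_{1} = \vec{g}_{1} + \gamma \, \vec{g}_{0}$ fixes
$\gamma = - (\mbox{${\bf A}$} \cdot \vec{g}_{1}) \cdot (\mbox{${\bf A}$} \cdot \vec{g}_{0}) /
\left\vert \mbox{${\bf A}$} \cdot \vec{g}_{0} \right\vert^{2}$, which is exactly the direction prescribed in the box.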


For dimension $N=2$ this is all there is: $\vec{x} = \vec{x}_{2}$ is the solution vector. For systems of higher dimension one continues from $\vec{x}_{2}$ along the direction

\begin{displaymath}
\mbox{$\vec{h}_{2}$} = \mbox{$\vec{g}_{2}$} -
\frac{(\mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{2}$}) \cdot (\mbox{${\bf A}$} \cdot \mbox{$\vec{h}_{1}$})}
{\left\vert \mbox{${\bf A}$} \cdot \vec{h}_{1} \right\vert^{2}} \, \mbox{$\vec{h}_{1}$}
\end{displaymath}

until the next ``low point'' is reached at

\begin{displaymath}
\vec{x}_{3} = \vec{x}_{2} + \lambda_{3} \, \mbox{$\vec{h}_{2}$} \, ,
\end{displaymath}

with

\begin{displaymath}
\lambda_{3} = \frac{\left\vert \mbox{$\vec{g}_{2}$} \cdot \mbox{$\vec{h}_{2}$} \right\vert}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{h}_{2}$} \right\vert^{2}} \, .
\end{displaymath}

Next we determine $\vec{x}_{4}$ and so forth, until we arrive at the solution $\vec{x} = \vec{x}_{N}$.
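The complete recursion fits into a few lines; here is a minimal NumPy sketch (the function name, the optional step count, and the zero-denominator guard are illustrative choices, not from the notes):

\begin{verbatim}
import numpy as np

def conjugate_gradient(a, b, x, n_steps=None):
    # Minimize f(x) = 0.5*|a @ x - b|**2 along successive conjugate
    # directions; for a quadratic problem N steps suffice in exact
    # arithmetic.
    n = len(b) if n_steps is None else n_steps
    g = -a.T @ (a @ x - b)                 # g_0: steepest descent at x_0
    h = g                                  # first direction is the gradient
    for _ in range(n):
        ah = a @ h
        denom = ah @ ah
        if denom == 0.0:                   # gradient vanished: x solves A.x = b
            break
        lam = (g @ h) / denom              # line minimum along h
        x = x + lam * h                    # next "low point"
        g = -a.T @ (a @ x - b)             # new negative gradient
        ag = a @ g
        h = g - ((ag @ ah) / denom) * h    # direction conjugate to the old h
    return x
\end{verbatim}

Called with the $2 \times 2$ data of the example below, two passes through the loop already reproduce the solution $\vec{x}_{2} \approx (1,0)$.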

Advantage of CG: No matrix inversion!

But: several multiplications $\mbox{${\bf A}$} \cdot \vec{x}$ are required; the method is therefore most efficient for sparse matrices.
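The point is that $\mbox{${\bf A}$}$ enters only through products $\mbox{${\bf A}$} \cdot \vec{v}$ and $\mbox{${\bf A}$}^{T} \cdot \vec{v}$, so a sparse storage scheme can be used directly. A brief illustration with SciPy (the tridiagonal test matrix is purely illustrative, not part of the original notes):

\begin{verbatim}
import numpy as np
from scipy.sparse import diags

# Sparse tridiagonal matrix: never stored (or inverted) as a full array.
n = 1000
a = diags([-1.0, 3.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = np.zeros(n)

g = -(a.T @ (a @ x - b))   # one gradient evaluation = two sparse matrix-vector products
\end{verbatim}

The CG routine sketched above works unchanged with such a matrix, since it never forms $\mbox{${\bf A}$}^{-1}$.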

EXAMPLE: To see the mechanics of the method, assume

\begin{displaymath}
\mbox{${\bf A}$}=\mbox{$\left( \begin{array}{cc} 3 & 1 \\ 2 & 4 \end{array} \right)$} \, , \;\;\;
\vec{b}=\mbox{$\left( \begin{array}{r} 3 \\ 2 \end{array} \right)$} \, , \;\;\;
\vec{x}_{0}=\mbox{$\left( \begin{array}{r} 1.2 \\ 0.2 \end{array} \right)$}
\end{displaymath}

The negative gradient (direction of steepest descent) at $\vec{x}_{0}$ is

\begin{displaymath}
\mbox{$\vec{g}_{0}$} = - \mbox{${\bf A}$}^{T} \cdot \left[ \mbox{${\bf A}$} \cdot \vec{x}_{0} - \vec{b} \right]
= - \mbox{$\left( \begin{array}{r} 4.8 \\ 5.6 \end{array} \right)$}
\end{displaymath}

and

\begin{displaymath}
\lambda_{1} = \frac{\left\vert \mbox{$\vec{g}_{0}$} \right\vert^{2}}
{\left\vert \mbox{${\bf A}$} \cdot \mbox{$\vec{g}_{0}$} \right\vert^{2}} = 0.038
\end{displaymath}

such that

\begin{displaymath}
\vec{x}_{1}=\vec{x}_{0}+\lambda_{1}\vec{g}_{0}
=\mbox{$\left( \begin{array}{r} 1.017 \\ -0.014 \end{array} \right)$}
\end{displaymath}

Similarly,

\begin{displaymath}
\mbox{$\vec{g}_{1}$}=\mbox{$\left( \begin{array}{r} -0.063 \\ 0.054 \end{array} \right)$} \, , \;\;\;
\mbox{$\vec{h}_{1}$}=\mbox{$\left( \begin{array}{r} -0.0635 \\ 0.0532 \end{array} \right)$} \, , \;\;\;
\lambda_{2}=0.262 \, ,
\end{displaymath}

and thus

\begin{displaymath}
\vec{x}_{2}= \mbox{$\left( \begin{array}{r} 1.000 \\ \vspace{-9 pt}\\ 0.000 \end{array} \right)$}
\end{displaymath}

which is the desired solution vector in this 2-dimensional case. If $\mbox{${\bf A}$}$ were an $N \times N$ matrix, $N$ such steps would be necessary.
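For completeness, the two steps can be retraced numerically (a direct transcription of the formulas above into NumPy; the variable names are illustrative):

\begin{verbatim}
import numpy as np

a  = np.array([[3.0, 1.0], [2.0, 4.0]])
b  = np.array([3.0, 2.0])
x0 = np.array([1.2, 0.2])

g0   = -a.T @ (a @ x0 - b)                          # = -(4.8, 5.6)
lam1 = (g0 @ g0) / ((a @ g0) @ (a @ g0))            # = 0.038
x1   = x0 + lam1 * g0                               # = (1.017, -0.014)

g1   = -a.T @ (a @ x1 - b)
h1   = g1 - (((a @ g1) @ (a @ g0)) / ((a @ g0) @ (a @ g0))) * g0
lam2 = (g1 @ h1) / ((a @ h1) @ (a @ h1))            # = 0.262
x2   = x1 + lam2 * h1

print(np.round(x2, 3))                              # [1. 0.]
\end{verbatim}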

Franz J. Vesely Oct 2005
See also:
"Computational Physics - An Introduction," Kluwer-Plenum 2001