
MoreLinear: Hessenberg form


(ns morelinear.hessenberg
  (:require [clojure.core.matrix :refer :all]
            [morelinear.householder :refer :all]))

Preface

For context on how to produce this decomposition, see the Householder QR decomposition. There are a lot of parallels and overlap between the two.

Reduction with reflectors

The Householder QR decomposition gave us a great tool for expressing a linear system in a convenient orthogonal basis. The Q is our new basis and R holds the coordinates of A in that basis. However, if we go back to thinking about a linear system and rewrite Ax=b in terms of the QR, we get something nonsensical. In the equation QRx=b the Rx is not particularly meaningful on its own because it multiplies coordinates in one basis by a vector in the standard basis.

The QR has its uses, but looking back at pages 254-255, one thing we were shooting for was a similarity transform, i.e. a nice new basis in which we could easily perform the linear function A. We should be able to take our input vector x, change it into this convenient basis, put it through our linear system, and then go back to the standard basis we started with. The QR doesn't give us that.

Again working with reflectors, the text starting on page 350 presents a solution called the Upper-Hessenberg form. This mirrors the Householder decomposition in a lot of ways and turns our matrix into an almost upper-triangular matrix (it just has one nonzero subdiagonal). The text states that reaching this form is much easier than finding a basis that is fully upper triangular (that's developed on page 353 under the Jacobi reduction and then in Chapter 7). Basically we are doing the same thing as before, but each time we perform a reflection we also reflect the output back, so that in effect anything we put in comes out in the same basis it started in. So instead of Qk..Q2Q1A we are doing Qk..Q2Q1AQ1Q2..Qk (reflections are their own inverse, so you can view the reflectors on the left of A as undoing the ones on the right). Again we can put all the Q's into one Q matrix and rewrite this as QᵀAQ. Since Q is orthonormal (but not symmetric) its inverse is its transpose. If you distribute the transpose over the individual reflectors you will see it works out.
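Spelled out with k reflectors (and using the fact that each reflector is symmetric, so transposing the product just reverses its order):

\begin{equation} Q = Q_{1} Q_{2} \cdots Q_{k} \qquad \Rightarrow \qquad Q^{T} A Q = Q_{k} \cdots Q_{2} Q_{1} \, A \, Q_{1} Q_{2} \cdots Q_{k} \end{equation}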

Again, we start by looking at how to zero the first column (below its subdiagonal) with this new reflector setup. Since we are only shooting for the form with one extra off-diagonal, this is going to look a lot like the second-column case from before:

So if

\begin{equation} Q_{1} = \begin{bmatrix} 1 & 0\\ 0 & S\\ \end{bmatrix} \end{equation}

Then we can write out Q1AQ1 as:

\begin{equation} \begin{bmatrix} 1 & 0\\ 0 & S\\ \end{bmatrix} \begin{bmatrix} A_{1,1} & A_{1,*}\\ A_{*,1} & A_{sub}\\ \end{bmatrix} \begin{bmatrix} 1 & 0\\ 0 & S\\ \end{bmatrix} \\= \begin{bmatrix} A_{1,1} & A_{1,*} S\\ SA_{*,1} & S A_{sub} S \end{bmatrix} \end{equation}

And we want to zero out everything below the subdiagonal in that first column:

\begin{equation} \begin{bmatrix} A_{1,1} & A_{1,*} S\\ \begin{bmatrix} \lVert A_{*,1} \rVert \\ 0 \\ \vdots \\ 0 \end{bmatrix} & S A_{sub} S \end{bmatrix} \end{equation}

So this should look familiar. We want to pick S such that S A_{*,1} is a multiple of the first elementary vector [ 1 0 0 .. 0 ] (since reflections preserve length, that multiple works out to be the length of A_{*,1}).

(defn first-subcolumn-reflector
  "Build a matrix that will reflect everything below the first element
  of the first column of INPUT-MATRIX on to the first elementary vector
  [ 1 0 0 .. 0 ] - leaving one nonzero entry on the subdiagonal"
  [input-matrix]
  (let [dimension (dec (dimension-count input-matrix
                                        0))]
    (block-diagonal-matrix [[[1.0]]
                            (elementary-coordinate-reflector (subvector (get-column input-matrix 0)
                                                                        1
                                                                        dimension)
                                                             (get-row (identity-matrix dimension) 0))])))
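The elementary-coordinate-reflector comes from the morelinear.householder namespace (see the Householder QR write-up). In case you don't have that handy, here is a minimal sketch of what such a function needs to do - the name and signature match the call above, but this is just the standard Householder construction and not necessarily the implementation in that namespace:

(defn elementary-coordinate-reflector-sketch
  "Sketch: build a reflector S with (mmul S v) = ||v|| * TARGET,
  where TARGET is assumed to be a unit vector (an elementary vector above).
  Standard Householder construction: S = I - 2ww'/(w'w) with w = v - ||v||*TARGET"
  [v target]
  (let [w (sub v (mul (length v) target))
        n (dimension-count v 0)]
    (if (zero? (length w))
      (identity-matrix n) ;; v already points along TARGET, nothing to reflect
      (sub (identity-matrix n)
           (mul (outer-product w w)
                (/ 2.0 (dot w w)))))))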

Just to demonstrate the result:

(def A (mutable [[43.0 36.0 38.0 90.0]
                 [21.0 98.0 55.0 48.0]
                 [72.0 13.0 98.0 12.0]
                 [28.0 38.0 73.0 20.0]]))

(def Q (first-subcolumn-reflector A))

(pm (mmul Q A Q))
;; [[43.000  75.097 -9.666  71.463]
;;  [80.056 139.128 20.275 -11.552]
;;  [-0.000  38.239 48.433  48.865]
;;  [ 0.000  60.253 38.559  28.439]]
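The subdiagonal entry 80.056 is exactly the length of the original subcolumn [21 72 28], and everything below it is zero, just as we wanted. As a quick sanity check of the claim that reflectors are their own inverse, we can also square Q:

(pm (mmul Q Q))
;; should print the 4x4 identity matrix (up to rounding)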

And then we will want to call this whole business recursively on the SAsubS block:

(defn hessenberg-reduction
  "Recursively reduce INPUT-MATRIX to upper-Hessenberg form in place,
  accumulating the product of reflectors into REDUCTION-MATRIX"
  [reduction-matrix input-matrix]
  (if (= 2 (row-count input-matrix))
    reduction-matrix
    (let [reflector (first-subcolumn-reflector input-matrix)]
      ;; apply the similarity transform in place: Q A Q
      (do (assign! input-matrix
                   (mmul reflector
                         input-matrix
                         reflector))
          ;; pad the reflector back up to full size with an identity block,
          ;; fold it into the accumulated Q, and recurse on the SAsubS block
          (recur (mmul reduction-matrix
                       (block-diagonal-matrix [(identity-matrix (- (row-count reduction-matrix)
                                                                   (row-count input-matrix)))
                                               reflector]))
                 (submatrix input-matrix
                            1
                            (dec (row-count input-matrix))
                            1
                            (dec (column-count input-matrix))))))))

(defn hessenberg
  "Reduce INPUT-MATRIX to upper-Hessenberg form in place and return the
  orthogonal matrix Q of accumulated reflectors (so that QᵀAQ is the result)"
  [input-matrix]
  (hessenberg-reduction (identity-matrix (row-count input-matrix))
                        input-matrix))

Again demonstrating the result:

(def A (mutable [[43.0 36.0 38.0 90.0]
                 [21.0 98.0 55.0 48.0]
                 [72.0 13.0 98.0 12.0]
                 [28.0 38.0 73.0 20.0]]))

(def Q (hessenberg A))

(pm (mmul (transpose Q) 
          [[43.0 36.0 38.0 90.0]
           [21.0 98.0 55.0 48.0]
           [72.0 13.0 98.0 12.0]
           [28.0 38.0 73.0 20.0]]
          Q))
;; [[43.000  75.097 55.158 -46.454]
;;  [80.056 139.128  1.110  23.309]
;;  [-0.000  71.363 73.732  22.504]
;;  [-0.000   0.000 32.809   3.140]]
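And since Q is just an accumulated product of reflectors it should be orthogonal, which we can spot-check by multiplying it against its own transpose:

(pm (mmul (transpose Q) Q))
;; should print the 4x4 identity matrix (up to rounding)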

As you can see, the algorithm is almost the same as the (householder ..) one. Here the fact that A is reduced in place is a bit of an inconvenience. It's not difficult to modify things to work on a copy of A if desired.
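For instance, a minimal sketch of a non-destructive variant (the name hessenberg-copy is made up here, it's not part of the namespace above) could just run the reduction on a copy:

(defn hessenberg-copy
  "Sketch: like hessenberg, but leaves INPUT-MATRIX untouched.
  Returns the accumulated Q and the reduced (Hessenberg) copy."
  [input-matrix]
  (let [scratch (mutable input-matrix)]
    {:q (hessenberg scratch)
     :h scratch}))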

On page 352 it claims that a Hessenberg reduction on a symmetric matrix will give us a tridiagonal matrix. We can double check this:

(def A (mutable [[43.0 36.0 38.0 90.0]
                 [36.0 98.0 55.0 48.0]
                 [38.0 55.0 98.0 12.0]
                 [90.0 48.0 12.0 20.0]]))

(def Q (hessenberg A))

(pm (mmul (transpose Q) 
          [[43.0 36.0 38.0 90.0]
           [36.0 98.0 55.0 48.0]
           [38.0 55.0 98.0 12.0]
           [90.0 48.0 12.0 20.0]]
          Q))
;; [[ 43.000 104.115  0.000  0.000]
;;  [104.115  89.863 82.131  0.000]
;;  [  0.000  82.131 73.358 20.256]
;;  [  0.000   0.000 20.256 52.779]]

Indeed it does :)