2.1.4 Sistemas Lineares — Métodos Diretos — Decomposição SVD

1. Fatoração em Valores Singulares

Toda matriz pode ser decomposta como sendo um produto de três matrizes de propriedades específicas. Quando a matriz a ser decomposta $A \in \mathbf{R}^{m \times n}$ com $m \leq n$ (ou seja, mais linhas que colunas), a decomposição em valores singulares possui a forma

$$ A = U ~
\left[ \begin{array}{c}
S \\
0
\end{array} \right]
~ V^{T} $$

onde $U$ e $V$ são matrizes ortogonais de dimensão $m \times m$ e $n \times n$ , respectivamente, e $S$ é uma matriz diagonal de ordem $n \times n$ com elementos $\sigma_i$ , $i = 1, 2, \ldots, n$ , que satisfaz

$$
\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \geq 0
$$

Os valores $\sigma_i \in diag(S)$ são chamados de valores singulares da matriz $A$ . Podemos definir a seguinte condição para a matriz não-singular $A$

$$ \kappa(A) = \|A\|~\|A^{-1}\| $$

como sendo $\frac{\sigma\_1}{\sigma\_n}$ .

Quando $m \leq n$ , isto é, quando o número de colunas é menor ou igual ao número de colunas, a decomposição terá a forma

$$
A = U ~ \left[ ~ S ~ ~ ~ 0 ~\right] ~ V^{T}
$$

onde, novamente, $U$ e $V$ são ortogonais e terão dimensões $m \ times m$ e $n \times n$ , respectivamente. $S$ agora será uma matriz singular $m \times m$ com a diagonal não-negativa, com elementos

$$
\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_m \geq 0
$$

Para $A$ simétrica, com $n$ autovalores reais $\lambda\_1, \lambda\_2, \ldots, \lambda\_n$ , e seus autovetores associados $q\_1, q\_2, \ldots, q\_n$ , podemos usar a decomposição espectral de $A$ :

$$
A = \sum_{i = 1}^{n} \lambda_i q_i q_i^T
$$

Assim, a decomposição pode ser reescrita, definindo-se

$$
\Lambda = diag(\lambda_1, \lambda_2, \ldots, \lambda_n)
$$

$$
Q = \left[ \begin{array}{c}
q_1 \\
q_2 \\
\cdots \\
q_n
\end{array} \right]
$$

e reescrevendo $A$ como

$$
A = Q ~ \Lambda ~ Q^T
$$

Esta decomposição é idêntica à SVD quando definimos $U = V = Q$ e $S = \Lambda$ . Note que os valores singulares $\sigma\_i$ e os autovalores $\lambda\_i$ coincidem neste caso.

No caso da norma euclidiana para matrizes

$$
\|A\|_2 = ~\text{maior autovalor de } \sqrt{\left( A^T ~ A \right)}
$$

temos que

$$
\|A\| = \sigma_1(A) = \text{ maior autovalor de } A
$$

$$
\|A^{-1}\| = \sigma_n(A) = \text{ inverso do menor autovalor de } A
$$

desta forma, temos que para todo $x \in \mathbf{R}^n$ ,

$$
\sigma_n(A)~\|x\|^2 = \dfrac{\|x\|^2}{A^{-1} \leq x^T ~ Ax \leq
\|A\| ~ \|x\|^2 = \sigma_1(A) ~ \|x\|^2
$$

Para a matriz ortogonal $Q$ , temos que a norma euclidiana

$$
\|Qx\| = \|x\|
$$

e que todos os valores singulares desta matriz será igual à 1.

1.1. Matriz pseudo-inversa de Moore

Decompondo a matriz A em três matrizes $U$ , $S$ e $V$

$$
A = U \times S \times V
$$

podemos encontrar a inversa de $A$ com SVD através da seguinte formula

$$
A^{-1} = (U ~ S ~ V)^{-1} = V^{-1} ~ S^{-1} ~ U^{-1} = V^T ~ S^{-1} ~ U^T
$$

onde $L = \frac{1}{S} \rightarrow L\_i = \frac{1}{S\_i}$ , $U^{-1} = U^T$ e $V^{-1} = V^T$ . Desta forma, podemos obter facilmente a matriz inversa de $A$ via SVD.

1.2. Resolvendo sistemas lineares com SVD

Para resolver um sistema linear, basta multiplicarmos a matriz inversa com o vetor de elementos independentes. Isto é, para o sistema linear

$$
AX = B
$$

temos que

$$
X = A^{-1} B
$$

onde $A^{-1}$ obtemos via SVD. Assim, o vetor das soluções é

$$
X = V L U^T B
$$

2. Implementação

Existem diversos algoritmos para computação das matrizes decompostas e o vetor de valores singulares. A seguir, apresentamos o código de uma implementação simples e eficiente da fatoração em valores singulares.

def SVD(arg):
    """
    Return the (S,V,D) from Singular Values Decomposition.

    SVD(A)

    INPUT:
    * A: matrix to decompose

    return: the tuple (S,V,D)

    Author: Pedro Garcia [sawp@sawp.com.br]
    see: http://www.sawp.com.br

    License: Creative Commons
             http://creativecommons.org/licenses/by-nc-nd/2.5/br/

    OBS: This is a implementation of a Modified Golub-Reinsch SVD and
         the code is adapted from the NIST Template Numerical Toolkit (TNT)
         by Roldan Pozo. see: http://math.nist.gov/tnt/

    Aug 2010
    """
    A = deepcopy(arg)
    (m, n) = (len(A), len(A[0]))
    nu = min(m, n)
    (numcols, numrows) = (min(m - 1, n), max(0, min(n - 2, m)))
    S = [[0.0 for i in xrange(nu)] for j in xrange(m)]  # singular values
    V = [0.0 for i in xrange(min(m + 1, n))]
    D = [[0.0 for i in xrange(n)] for j in xrange(n)]
    e = [0.0 for i in xrange(n)]
    sigma = [0.0 for i in xrange(m)]

    # Diagonalize the original matrix in superdiagonal V
    for k in xrange(max(numcols, numrows)):
        km_range = xrange(k, m)
        kp1_range = xrange(k + 1, n)
        if k < numcols:
            # Compute the 2-norm of k-th column for S
            V[k] = 0.0
            for i in km_range:
                V[k] = hypot(V[k], A[i][k])
            if V[k] != 0.0:
                if A[k][k] < 0.0:
                    V[k] = -V[k]
                for i in km_range:
                    A[i][k] = A[i][k] / V[k]
                A[k][k] = A[k][k] + 1.0
            V[k] = -V[k]
        for j in kp1_range:
            if (k < numcols) and (V[k] != 0.0):
                t = 0.0
                for i in km_range:
                    t = t + A[i][k] * A[i][j]
                t = -t / A[k][k]
                for i in km_range:
                    A[i][j] = A[i][j] + t * A[i][k]
                e[j] = A[k][j]
        if k < numcols:
            for i in km_range:
                S[i][k] = A[i][k]

        if k < numrows:
            # Compute the 2-norm of k-th column for D
            e[k] = 0.0
            for i in kp1_range:
                e[k] = hypot(e[k], e[i])
            if e[k] != 0.0:
                if e[k + 1] < 0.0:
                    e[k] = -e[k]
                for i in kp1_range:
                    e[i] = e[i] / e[k]
                e[k + 1] = e[k + 1] + 1.0
            e[k] = -e[k]
            if (k + 1 < m) and (e[k] != 0.0):
                for i in xrange(k + 1, m):
                    sigma[i] = 0.0
                for j in kp1_range:
                    for i in xrange(k + 1, m):
                        sigma[i] = sigma[i] + e[j] * A[i][j]
                for j in kp1_range:
                    t = -e[j] / e[k + 1]
                    for i in xrange(k + 1, m):
                        A[i][j] = A[i][j] + t * sigma[i]
            for i in kp1_range:
                D[i][k] = e[i]

    # setup bidiagonal matrix
    p = min(n, m + 1)
    if numcols < n:
        V[numcols] = A[numcols][numcols]
    if m < p:
        V[p - 1] = 0.0
    if (numrows + 1) < p:
        e[numrows] = A[numrows][p - 1]
    e[p - 1] = 0.0

    # generate S
    for j in xrange(numcols, nu):
        for i in xrange(m):
            S[i][j] = 0.0
        S[j][j] = 1.0
    for k in xrange(numcols - 1, -1, -1):
        if V[k] != 0.0:
            for j in xrange(k + 1, nu):
                t = 0.0
                for i in xrange(k, m):
                    t = t + S[i][k] * S[i][j]
                t = -t / S[k][k]
                for i in xrange(k, m):
                    S[i][j] = S[i][j] + t * S[i][k]
            for i in xrange(k, m):
                S[i][k] = -S[i][k]
            S[k][k] = 1.0 + S[k][k]
            for i in xrange(k - 1):
                S[i][k] = 0.0
        else:
            for i in xrange(m):
                S[i][k] = 0.0
            S[k][k] = 1.0

    # generate D
    for k in xrange(n - 1, -1, -1):
        if (k < numrows) and (e[k] != 0.0):
            for j in xrange(k + 1, nu):
                t = 0.0
                for i in xrange(k + 1, n):
                    t = t + D[i][k] * D[i][j]
                t = -t / D[k + 1][k]
                for i in xrange(k + 1, n):
                    D[i][j] = D[i][j] + t * D[i][k]
        for i in xrange(n):
            D[i][k] = 0.0
        D[k][k] = 1.0

    # SVD
    pp = p - 1
    itr = 0
    eps = euler ** (-64)
    while p > 0.0:
        for k in xrange(p - 2, -2, -1):
            if k == -1:
                break
            if abs(e[k]) < = eps * (abs(V[k]) + abs(V[k + 1])):
                e[k] = 0.0
                break
        if k == (p - 2):
            caso = 4
        else:
            for ks in xrange(p - 1, k - 1, -1):
                if ks == k:
                    break
                if ks != p:
                    t1 = abs(e[ks])
                else:
                    t1 = 0.0
                if ks != (k + 1):
                    t2 = abs(e[ks - 1])
                else:
                    t2 = 0.0
                t = t1 + t2
                if abs(V[ks]) <= eps * t:
                    V[ks] = 0.0
                    break
            if ks == k:
                caso = 3
            elif ks == (p - 1):
                caso = 1
            else:
                caso = 2
                k = ks
        k += 1

        if caso == 1:
            f = e[p - 2]
            e[p - 2] = 0.0
            for j in xrange(p - 2, k - 1, -1):
                t = hypot(V[j], f)
                cs = V[j] / t
                sn = f / t
                V[j] = t
                if j != k:
                    f = -sn * e[j - 1]
                    e[j - 1] = cs * e[j - 1]
                for i in xrange(n):
                    t = cs * D[i][j] + sn * D[i][p - 1]
                    D[i][p - 1] = -sn * D[i][j] + cs * D[i][p - 1]
                    D[i][j] = t
        elif caso == 2:
            f = e[k - 1]
            e[k - 1] = 0.0
            for j in xrange(k, p):
                t = hypot(V[j], f)
                cs = V[j] / t
                sn = f / t
                V[j] = t
                f = -sn * e[j]
                e[j] = cs * e[j]
                for i in xrange(m):
                    t = cs * S[i][j] + sn * S[i][k - 1]
                    S[i][k - 1] = -sn * S[i][j] + cs * S[i][k - 1]
                    S[i][j] = t
        elif caso == 3:
            # qr-elimination
            s = max(abs(V[k]), abs(e[k]), abs(e[p - 2]))
            s = max(s, abs(V[p - 2]), abs(V[p - 1]))
            memv = (V[p - 1], V[p - 2], e[p - 2], V[k], e[k])
            (sp, spm1, epm1, sk, ek) = map((lambda x: x / s), memv)
            b = ((spm1 + sp) * (spm1 - sp) + epm1 * epm1) / 2.0
            c = (sp * epm1) * (sp * epm1)
            shift = 0.0
            if (b != 0.0) or (c != 0.0):
                shift = hypot(b, c)
                if b < 0.0:
                    shift = -shift
                shift = c / (b + shift)
            f = (sk + sp) * (sk - sp) + shift
            g = sk * ek
            for j in xrange(k, p - 1):
                t = hypot(f, g)
                cs = f / t
                sn = g / t
                if j != k:
                    e[j - 1] = t
                f = cs * V[j] + sn * e[j]
                e[j] = cs * e[j] - sn * V[j]
                g = sn * V[j + 1]
                V[j + 1] = cs * V[j + 1]
                for i in xrange(n):
                    t = cs * D[i][j] + sn * D[i][j + 1]
                    D[i][j + 1] = -sn * D[i][j] + cs * D[i][j + 1]
                    D[i][j] = t
                t = hypot(f, g)
                (cs, sn) = (f / t, g / t)
                V[j] = t
                f = cs * e[j] + sn * V[j + 1]
                V[j + 1] = -sn * e[j] + cs * V[j + 1]
                g = sn * e[j + 1]
                e[j + 1] = cs * e[j + 1]
                if (j < m - 1):
                    for i in xrange(m):
                        t = cs * S[i][j] + sn * S[i][j + 1]
                        S[i][j + 1] = -sn * S[i][j] + cs * S[i][j + 1]
                        S[i][j] = t
            e[p - 2] = f
            itr = itr + 1
        elif caso == 4:
            if V[k] <= 0.0:
                if V[k] < 0.0:
                    V[k] = -V[k]
                else:
                    V[k] = 0.0
                for i in xrange(pp + 1):
                    D[i][k] = -D[i][k]
            # sort values
            while (k < pp) and (V[k] < V[k + 1]):
                (V[k], V[k + 1]) = (V[k + 1], V[k])
                if k < n - 1:
                    for i in xrange(n):
                        (D[i][k], D[i][k + 1]) = (D[i][k + 1], D[i][k])
                if k < m - 1:
                    for i in xrange(m):
                        (S[i][k], S[i][k + 1]) = (S[i][k + 1], S[i][k])
                k += 1
            itr = 0
            p -= 1
    return (S, V, D)

2.1. Utilizando a decomposição para resolver sistemas lineares

def solveSVD(A, b):
    """
    Return a list with the solution of linear system.

    solveSVD(A, b)

    INPUT:
    * A: a matrix of coefficients
    * b: a list, relative a vector of independent terms

    return: a list with unknown vector 'x' of system A*x = b.

    Author: Pedro Garcia [sawp@sawp.com.br]
    see: http://www.sawp.com.br

    License: Creative Commons
             http://creativecommons.org/licenses/by-nc-nd/2.5/br/

    Aug 2010
    """

    def transpose(M):
        """ Computes the transpose matrix of M.

        Return: M^T.
        """
        m = xrange(len(M))
        n = xrange(len(M[0]))
        p = product(m, n)
        MT = [[0.0 for i in n] for j in m]
        for i in p:
            MT[i[0]][i[1]] = M[i[1]][i[0]]
        return MT

    def matmul(A, B):
        """ Perform a matrix multiplication for two square matrices.
        """
        m = len(A)
        if type(B[0]) == list:
            C = [[0.0 for i in xrange(m)] for j in xrange(m)]
            for i in xrange(m):
                for j in xrange(m):
                    for k in xrange(m):
                        C[i][j] = C[i][j] + A[i][k] * B[k][j]
        else:
            C = [0.0 for i in xrange(m)]
            for i in xrange(m):
                for j in xrange(m):
                        C[i] = C[i] + A[i][j] * B[j]
        return C

    def singular_threshold(S):
        m = len(S)
        C = [[0.0 for i in xrange(m)] for j in xrange(m)]
        for i in xrange(m):
            C[i][i] = 1.0 / S[i]
        return C

    def solve(A, b):
        (u, s, v) = SVD(A)
        l = singular_threshold(s)
        x1 = matmul(v, l)
        x2 = matmul(transpose(u), b)
        x = matmul(x1, x2)
        return x

    # Function's body
    return solve(A, b)

Estas funções estão disponíveis em http://www.sawp.com.br/code/linear/SVD.py, assim como um exemplo de como utilizá-las.

3. Copyright

Este documento é disponível sob a licença Creative Commons. As regras dos direitos de cópia deste conteúdo estão acessíveis em http://creativecommons.org/licenses/by-nc-nd/2.5/br/.

References

[1] Anthony Ralston and Philip Rabinowitz, A First Course in Numerical Analysis (2nd ed.), McGraw-Hill and Dover, (2001), 423-424.
[2] Lars Elden, Numerical Linear Algebra and Applications in Data Mining (Preliminary version), http://www.mai.liu.se/~ laeld/, (2005).

2.1.4 Sistemas Lineares — Métodos Diretos — Decomposição SVD

1. Fatoração em Valores Singulares

1.1. Matriz pseudo-inversa de Moore </p>

1.2. Resolvendo sistemas lineares com SVD </p>

2. Implementação

2.1. Utilizando a decomposição para resolver sistemas lineares </p>

3. Copyright

References